Dual language Document Assistant: 'clean' bilingual table?
- michaelbeijer
- Topic Author
- New member
2 years 8 months ago #142
by michaelbeijer
Hi Stanislav,
I have recently been playing around with Dual Language Document Assistant and am wondering if there is a function to "clean" the bilingual tables, so that they end up containing just the translation, that is, ready to be sent to the client. The same applies to the second option, "Convert selected paragraphs to dual language text."
Michael
2 years 8 months ago #143
by stanislav
Hi Michael!
Can you tell me how your dual-language document was created? Was it created manually by someone, or was it created by Dual Language Document Assistant and then translated in a CAT tool?
There is no tool in TransTools for this, but you can accomplish this relatively quickly using the following process:
1) A dual-language table can be converted to single-language text like this: in each table, delete one column, then convert the table to text using Word's built-in command. If there are multiple tables, you can do it more quickly: highlight all dual-language tables in a color (to differentiate them from non-dual-language tables), then delete one column from the first table and do the same for the other tables by putting the selection in the redundant column and pressing F4 (which repeats the last action). Finally, select the first table, go to the Table Tools > Layout tab and click the Convert to Text button. Make sure Paragraph Marks is selected in the options dialog, then click OK to convert the table to text. For the other tables, put the insertion point anywhere in the table and press F4 to convert it to text. When you are done, you may need to adjust indents and other paragraph formatting. This takes only two passes through the document, so it should take a minute or two, depending on the number of dual-language tables.
2) Dual-language paragraphs can be converted to single-language text using the TransTools+ Multiple Find & Replace tool ( www.translatortools.net/products/transto...-multiplefindreplace ). But this works only if both languages are inside the same paragraph, i.e. the delimiter is a slash (forward slash) or a line break, and not a paragraph break. Suppose the delimiter is /: you then need a regex search expression that finds the text before or after the slash and replaces it with nothing.
If you want to remove text before the slash:
Type: Regular expression
Find What: ^[^\/]+\/\s*
Replace with: nothing
If you want to remove text after the slash:
Type: Regular expression
Find What: \s*\/[^\/]+$
Replace with: nothing
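To see what the two expressions do before running them on a real document, here is a minimal Python sketch. The patterns are copied from the settings above; the function names and the sample paragraph are made up for illustration, and Python's `re` module is only a stand-in for whatever regex engine the Multiple Find & Replace tool uses, so edge cases may behave differently there.

```python
import re

# Patterns copied from the post; "/" is the assumed delimiter.
REMOVE_BEFORE_SLASH = re.compile(r"^[^\/]+\/\s*")
REMOVE_AFTER_SLASH = re.compile(r"\s*\/[^\/]+$")


def keep_text_after_slash(paragraph: str) -> str:
    """Remove everything up to and including the first slash."""
    return REMOVE_BEFORE_SLASH.sub("", paragraph, count=1)


def keep_text_before_slash(paragraph: str) -> str:
    """Remove the last slash and everything after it."""
    return REMOVE_AFTER_SLASH.sub("", paragraph, count=1)


# Hypothetical dual-language paragraph with "/" as the delimiter.
sample = "Purchase agreement / Koopovereenkomst"
print(keep_text_after_slash(sample))   # Koopovereenkomst
print(keep_text_before_slash(sample))  # Purchase agreement
```

A paragraph without any slash is left unchanged by both functions, which matches the fact that such paragraphs would not show up as matches at all.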
In the Multiple Find & Replace tool, add one of these search expressions to the list, then click Find All. This will give you a list of matches. Make sure that each match represents the text you want to remove: the source document may itself have contained a slash, in which case the dual-language paragraph has that slash in addition to the delimiting slash. Also, it is common in dual-language documents to leave certain parts in one language only, and such paragraphs may also contain a slash. So, go through the list of matches with the arrow keys and use the Space key to uncheck the paragraphs you don't want to replace. When done, click the Replace Selected button to remove the second language.
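The manual check described above can be partly pre-sorted. The following is a rough sketch, not a feature of TransTools: the `triage` function and the sample paragraphs are hypothetical, and it assumes the paragraphs have already been exported as plain strings. It separates paragraphs with exactly one slash from those with several, since the latter are the ones most likely to contain a slash that is not the delimiter.

```python
def triage(paragraphs, delimiter="/"):
    """Split paragraphs into likely-safe and needs-review lists.

    Paragraphs without the delimiter are skipped, because the regex
    search would not match them anyway.
    """
    to_replace, to_review = [], []
    for text in paragraphs:
        count = text.count(delimiter)
        if count == 0:
            continue
        if count == 1:
            to_replace.append(text)
        else:
            to_review.append(text)
    return to_replace, to_review


paragraphs = [
    "Price / Prijs",
    "No delimiter here",
    "Input/output ports / Invoer-/uitvoerpoorten",
]
safe, review = triage(paragraphs)
```

Even the single-slash group still deserves a quick look, because a single-language paragraph such as one containing "and/or" also has exactly one slash.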
If you previously highlighted the tables in a color, the color may help you differentiate the already converted text from dual-language paragraphs that still require conversion.
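For the table procedure in step 1, the underlying transformation can be sketched in a few lines of Python. This only models the data and does not automate Word: a table is assumed to be a list of rows, each row a list of cell strings, and a newline stands in for the paragraph mark that Convert to Text inserts. The function name and sample rows are made up for illustration.

```python
def table_to_text(rows, drop_column):
    """Drop one column, then emit the remaining cells as paragraphs."""
    paragraphs = []
    for row in rows:
        for index, cell in enumerate(row):
            if index != drop_column:
                paragraphs.append(cell)
    # One paragraph per remaining cell, separated by paragraph marks.
    return "\n".join(paragraphs)


# Hypothetical two-column dual-language table: source left, target right.
rows = [
    ["Hello", "Hallo"],
    ["World", "Wereld"],
]
target_only = table_to_text(rows, drop_column=0)
```

Dropping column 0 keeps the right-hand column; pass `drop_column=1` instead to keep the left-hand one.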
Best regards,
Stanislav
2 years 8 months ago #144
by michaelbeijer
The dual-language document was created by me, using Dual Language Document Assistant.
I'm playing around with some ideas where I translate the file inside the table created by Dual Language Document Assistant, i.e. in MS Word, and then afterwards 'clean' the table to generate a target doc for the client.
Am just messing around as I am very impressed with Windows 11's new Voice Access feature (only available in the dev channel at the moment), which led me to wonder if there would be a way to make working in Word doable. The reason for this is that while Voice Access's dictation works in e.g. Studio and memoQ, none of its voice commands do. They only work inside a select few programs like MS Word, Notepad, etc.
re Voice Access, see e.g.: www.knowbrainer.com/forums/forum/message...tid=9&threadid=36171
I am playing around with various ideas around recreating basic CAT tool functionality (mainly: grid layout, translation memory lookup/saving, glossary lookup/saving) inside MS Word, so I could benefit from full Voice Access functionality.
One idea I had for the TM functionality is to use Trados T-Window for Clipboard ( community.rws.com/product-groups/trados-...%20to%20be%20running ).
Or maybe stuff from Wordfast Classic: TM and/or glossaries.
Or GT4T as an MT provider, as well as using its "Simple Glossaries", which can be used for both lookup and saving, and even to modify MT output.
Or use LogiTerm, which I own, but never really use fully, which can also translate inside MS Word and do all manner of TM/TB lookup stuff.
As you can see, lots of ideas! Basically just messing around (as usual).
Michael
2 years 8 months ago #145
by stanislav
Great! Good luck with finding the best workflow for this. I wonder how Voice Access interfaces with a particular Windows application. The answer could tell us whether there is any hope that Studio, memoQ and other common CAT tools will implement this. By the way, did you try bilingual formats from memoQ or Studio?
2 years 8 months ago #146
by michaelbeijer
Yes, I have also been playing around with memoQ and Studio bilingual formats!
I suspect Voice Access will never work properly with Studio or memoQ, but I may be wrong. However, I will definitely be asking the developers of both tools to work on it, as this would be a fantastic feature to add to a CAT tool: not just the ability to dictate flowing text, but also to select and edit individual words, and all kinds of other voice-powered commands (confirm segment, run concordance, add terms, etc.).
I can actually already do most of this with VoiceMacro ( www.voicemacro.net/ ), but the dictation quality of Voice Access is better than that of the engine used in VoiceMacro (which still relies on the old Microsoft dictation engine).