Producing a glossary with no repetitions from a file with many repetitions

Forum: SDL Trados support
Topic: Producing a glossary with no repetitions from a file with many repetitions
Poster: Huw Watkins

Hi Guys,

Some insight into how SDL Trados Studio (2009) works under the hood would be very helpful here. I have a very large Excel file that has now been fully translated. The agency has now asked me to compile a glossary from this file, but with no repetitions.

The original file was 300,000 words, with 60,000 no-match words and a number of fuzzies. There are over 200,000 repetitions, and I am using a fresh TM, so there are no TM matches.

Given the complexity of the task, the agency has kindly agreed that I can compile the glossary segment by segment (not word by word, or the next 10 years of my life would be written off(!!), or I'd have to buy some sort of term extraction tool, which is not going to happen).

My plan of action is this:

1) Recreate the project with another fresh TM.
2) Select the "Export unknown segments" option in the Analyze Files settings.
and possibly:
3) Select the "Export frequent segments" option in the Analyze Files settings.
4) Process the project as normal and use the export files from step 2, and possibly step 3, for my glossary.

My doubt concerns step 3. So far I have tried steps 1 and 2, which produced an unknown segments file that contains solely the no-match words. It doesn't contain the fuzzies, however (this is based on looking at the analysis of the file exported during the batch processing).

If I repeat the process but include step 3, will there be any duplication with the no-match words? Do the no-match words actually include the first occurrence of a segment that is repeated numerous times throughout a file? Is this the same for fuzzies?

Am I running the risk of having repetitions in the final glossary if I use both the unknown segments file and the frequent segments file? (Bear in mind that I want the fuzzies to appear, but not the reps. There are no 100% matches, which makes things easier...)
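Whatever Studio does internally, any overlap between the two export files could be removed afterwards. The following is only a minimal sketch in Python, assuming the segments from both files have already been turned into (source, target) pairs by some means; the function names and the sample segments are made up for illustration and have nothing to do with Studio itself.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]  # (source segment, target segment)


def normalise(segment: str) -> str:
    """Collapse runs of whitespace so trivially different copies compare equal."""
    return " ".join(segment.split())


def merge_without_repetitions(*segment_lists: Iterable[Pair]) -> List[Pair]:
    """Merge several lists of pairs, keeping only the first occurrence
    of each source segment (order of first appearance is preserved)."""
    seen = set()
    glossary = []
    for pairs in segment_lists:
        for source, target in pairs:
            key = normalise(source)
            if key in seen:
                continue  # repetition: this source segment is already in the glossary
            seen.add(key)
            glossary.append((key, normalise(target)))
    return glossary


# Made-up sample data standing in for the two export files.
unknown = [("Open the valve", "Ouvrir la vanne"),
           ("Close the valve", "Fermer la vanne")]
frequent = [("Open the valve", "Ouvrir la vanne"),
            ("Press  Start", "Appuyer sur Start")]

glossary = merge_without_repetitions(unknown, frequent)
for source, target in glossary:
    print(source, "=>", target)
```

Matching here is exact (after whitespace clean-up), so fuzzies are kept as separate entries and only true repetitions are dropped.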

My next question is this: I am finding that I am not able to export the unknown segments file to Excel (the format of the original file). Does anyone know how to solve this? I have attempted a good old-fashioned copy and paste of all the target segments into Excel, and that seems to work, thankfully(!!!), but I'm curious to know whether I can save the target file in Excel format or not.
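As a fallback outside Studio, the pairs could also be written to a CSV file, which Excel opens directly. This is just a sketch using Python's standard csv module; the function name, column headings and file name are assumptions for illustration, and it says nothing about whether Studio itself can save the export as Excel.

```python
import csv
import os
import tempfile


def write_glossary_csv(pairs, path):
    """Write (source, target) pairs to a CSV file with a header row.
    The utf-8-sig encoding adds a byte order mark, which helps Excel
    recognise accented characters correctly."""
    with open(path, "w", newline="", encoding="utf-8-sig") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Source", "Target"])
        writer.writerows(pairs)


# Made-up example: write one pair to a file in the temporary folder.
example_path = os.path.join(tempfile.gettempdir(), "glossary_example.csv")
write_glossary_csv([("Open the valve", "Ouvrir la vanne")], example_path)
```

The resulting file can then be opened in Excel and saved as a workbook if the agency needs that format.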

Any other tips on how I should approach this?

[Edited at 2013-07-26 14:31 GMT]
