Patch. Remove unused keys from en.txt

Programmers discuss here anything related to FreeOrion programming. Primarily for the developers to discuss.

Moderator: Committer

Geoff the Medio
Programming, Design, Admin
Posts: 13603
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: Patch. Remove unused keys from en.txt

#16 Post by Geoff the Medio »

Cjkjvfnby wrote:[[Enzyklopädie STEALTH_TITLE]] should be [[encyclopedia STEALTH_TITLE]]?
Yes.
DESC_VAR_LAUNCHRATE
[[MEDIDOR_RATIO_LANZAMIENTO]]
Stringtables can reference other entries in themselves, even if they don't appear in code or other content scripts. If that string is present elsewhere, this might make sense.
How should we handle duplicates (fi.txt)?

Code: Select all

OPTIONS_DB_UI_STATE_BUTTON_COLOR
Sets UI state button selected color. 

OPTIONS_DB_UI_STATE_BUTTON_COLOR
Sets UI state button color.
Remove duplicates. This might indicate a missing string, though.
Does a space before the key matter (ru.txt)?

Code: Select all

 ORDER_FLEET_SCRAP
Уничтожить корабли
Possibly, but it probably shouldn't be there regardless.
Does a space after the value matter? (aaa:<space> == '''aaa: ''')
Yes, especially if enclosed in quotes.
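
The two checks discussed above (duplicate keys, and keys with a stray leading or trailing space) could be automated along these lines. This is only a rough sketch: `lint_stringtable` and `KEY` are hypothetical names, the rule that a key is a run of uppercase letters, digits and underscores is an assumption, and multi-line ''' values are not handled.

```python
import re
from collections import Counter

# Assumption: a key is a run of uppercase letters, digits and underscores.
KEY = re.compile(r"[A-Z0-9_]+")


def lint_stringtable(text):
    """Return (duplicate_keys, padded_keys) for a table of single-line entries."""
    lines = text.splitlines()
    seen = Counter()
    padded = []
    i = 0
    while i < len(lines):
        raw = lines[i]
        stripped = raw.strip()
        if KEY.fullmatch(stripped) and i + 1 < len(lines):
            seen[stripped] += 1
            if raw != stripped:
                padded.append(stripped)  # key line has leading or trailing whitespace
            i += 2  # skip the value line that follows the key
        else:
            i += 1
    duplicates = [key for key, count in seen.items() if count > 1]
    return duplicates, padded
```

Run over fi.txt and ru.txt, a check like this would list the duplicated OPTIONS_DB_UI_STATE_BUTTON_COLOR entry and the padded ORDER_FLEET_SCRAP key, respectively.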

Cjkjvfnby
AI Contributor
Posts: 539
Joined: Tue Jun 24, 2014 9:55 pm

Re: Patch. Remove unused keys from en.txt

#17 Post by Cjkjvfnby »

Fix various minor issues:

- remove the key "FoodAllocationForMaxGrowth"; it is not present in en.txt
- fix Enzyklopädie
- fix spaces before keys
- remove the duplicate field (fi.txt)

If a key is missing in de.txt, will it be shown from en.txt or produce an error?

If I provided any code, scripts or other content here, it's released under GPL 2.0 and CC-BY-SA 3.0

Geoff the Medio
Programming, Design, Admin
Posts: 13603
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: Patch. Remove unused keys from en.txt

#18 Post by Geoff the Medio »

Cjkjvfnby wrote:If a key is missing in de.txt, will it be shown from en.txt or produce an error?
The GUI should use the default stringtable (en.txt) contents if the specified stringtable is missing a key.
Cjkjvfnby wrote:Fix various minor issues.
Committed.
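
The fallback behaviour described above (use en.txt when the selected stringtable lacks a key) can be pictured with a small sketch. This is not the actual GUI code; `user_string` is a hypothetical helper, the tables are plain dicts, and the error marker returned for a key missing from both tables is an assumption.

```python
def user_string(key, selected, default):
    """Look up key in the selected table, then fall back to the default (en.txt) table."""
    if key in selected:
        return selected[key]
    if key in default:
        return default[key]
    # Assumption: what is shown for a key missing everywhere is not stated in the thread.
    return "ERROR: " + key


fr = {"STAR_BLACK": "Trou noir"}
en = {"STAR_NONE": "no star"}
print(user_string("STAR_BLACK", fr, en))  # found in the selected table
print(user_string("STAR_NONE", fr, en))   # falls back to the default table
```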

Dilvish
AI Lead and Programmer Emeritus
Posts: 4768
Joined: Sat Sep 22, 2012 6:25 pm

Re: Patch. Remove unused keys from en.txt

#19 Post by Dilvish »

Geoff the Medio wrote:
Cjkjvfnby wrote:DESC_VAR_LAUNCHRATE
[[MEDIDOR_RATIO_LANZAMIENTO]]
Stringtables can reference other entries in themselves, even if they don't appear in code or other content scripts. If that string is present elsewhere, this might make sense.
Although it might make sense for a translator to make their own stringtable entries and then refer to them, what Cjkjvfnby has pointed out here is a little different. The English stringtable entry is

Code: Select all

DESC_VAR_LAUNCHRATE
[[METER_LAUNCH_RATE]]
The preferred practice would be to simply translate the entry for METER_LAUNCH_RATE, which has been done in the Spanish stringtable as

Code: Select all

METER_LAUNCH_RATE
Cadencia de Lanzamiento
The problem is that in DESC_VAR_LAUNCHRATE someone, apparently not understanding how the lookups work, also translated the key "METER_LAUNCH_RATE" into "MEDIDOR_RATIO_LANZAMIENTO" but did not make any stringtable entry for MEDIDOR_RATIO_LANZAMIENTO, so it will create a stringtable error. Similar changes have been made in many places in the Spanish stringtable, which will probably make it a larger headache for whoever next tries to update it. That particular problem can wait until someone is willing to tackle the job of cleaning up the Spanish stringtable, but perhaps we should add a big warning explaining the mistake at the top of the stringtable.
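
Broken references of this kind could be listed mechanically. The sketch below assumes the table has already been loaded into a dict; `broken_references` is a hypothetical helper, and treating the last word inside [[...]] as the referenced key (to cover forms like [[encyclopedia STEALTH_TITLE]]) is an assumption.

```python
import re

# Matches [[SOMETHING]] and captures the text between the brackets.
REFERENCE = re.compile(r"\[\[([^\]]+)\]\]")


def broken_references(table):
    """Return (key, missing_target) pairs for references that have no entry in table."""
    broken = []
    for key, value in table.items():
        for ref in REFERENCE.findall(value):
            parts = ref.split()
            if not parts:
                continue  # empty reference such as "[[ ]]"
            target = parts[-1]  # assumption: "[[prefix KEY]]" refers to KEY
            if target not in table:
                broken.append((key, target))
    return broken


es = {
    "DESC_VAR_LAUNCHRATE": "[[MEDIDOR_RATIO_LANZAMIENTO]]",
    "METER_LAUNCH_RATE": "Cadencia de Lanzamiento",
}
print(broken_references(es))
```

A table that only checks against itself will also flag references that would resolve through the English stringtable, so the output is a list of candidates to review rather than a list of definite errors.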
Geoff the Medio wrote:
Cjkjvfnby wrote:If a key is missing in de.txt, will it be shown from en.txt or produce an error?
The GUI should use the default stringtable (en.txt) contents if the specified stringtable is missing a key.
That's one of the issues I have a personal to-do about following up on -- our recent experience with the German lookup for STAR_GROUP_CHARS has made me concerned about whether this is working properly.
Cjkjvfnby wrote:One more question about spaces: the STAR_NONE value is a single space. Is this the correct value (fr.txt)?

Code: Select all

STAR_BLACK
Trou noir
STAR_NONE
 
INVALID_STAR_TYPE
Type d'étoile inconnu
The English entry is

Code: Select all

STAR_NONE
no star
and I would think all stringtables should have an actual translation for that rather than a simple space.

mileser
Space Squid
Posts: 57
Joined: Thu May 15, 2014 2:32 pm

Re: Patch. Remove unused keys from en.txt

#20 Post by mileser »

Cjkjvfnby wrote:One more question about spaces: the STAR_NONE value is a single space. Is this the correct value (fr.txt)?

Code: Select all

STAR_BLACK
Trou noir
STAR_NONE
 
INVALID_STAR_TYPE
Type d'étoile inconnu
The English entry is

Code: Select all

STAR_NONE
no star
and I would think all stringtables should have an actual translation for that rather than a simple space.
So, would that be:

STAR_NONE
Aucune étoile
Last edited by Dilvish on Wed Aug 20, 2014 11:19 pm, edited 1 time in total.
Reason: added missing [quote] to fix multilevel quote
OS: OS X 10.10 Yosemite, XCode 6.01
Also: If I provided any code, scripts or other content here, it's released under GPL 2.0 and CC-BY-SA 3.0

Cjkjvfnby
AI Contributor
Posts: 539
Joined: Tue Jun 24, 2014 9:55 pm

Re: Patch. Remove unused keys from en.txt

#21 Post by Cjkjvfnby »

Dilvish wrote:The problem is that in DESC_VAR_LAUNCHRATE someone, apparently not understanding how the lookups work, also translated the key "METER_LAUNCH_RATE" into "MEDIDOR_RATIO_LANZAMIENTO" but did not make any stringtable entry for MEDIDOR_RATIO_LANZAMIENTO, so it will create a stringtable error. Similar changes have been made in many places in the Spanish stringtable, which will probably make it a larger headache for whoever next tries to update it. That particular problem can wait until someone is willing to tackle the job of cleaning up the Spanish stringtable, but perhaps we should add a big warning explaining the mistake at the top of the stringtable.
Can I just remove them from the stringtables, or should I post them in the Spanish thread? (There are 41 broken references.)

Dilvish
AI Lead and Programmer Emeritus
Posts: 4768
Joined: Sat Sep 22, 2012 6:25 pm

Re: Patch. Remove unused keys from en.txt

#22 Post by Dilvish »

Cjkjvfnby wrote:Can I just remove them from the stringtables, or should I post them in the Spanish thread? (There are 41 broken references.)
I would suggest posting them in an ES stringtable thread (looks like it would be a new thread) but not removing them, which would seem a bit more like hiding a problem rather than solving/improving it.

Cjkjvfnby
AI Contributor
Posts: 539
Joined: Tue Jun 24, 2014 9:55 pm

Re: Patch. Remove unused keys from en.txt

#23 Post by Cjkjvfnby »

Dilvish wrote:
Cjkjvfnby wrote:Can I just remove them from the stringtables, or should I post them in the Spanish thread? (There are 41 broken references.)
I would suggest posting them in an ES stringtable thread (looks like it would be a new thread) but not removing them, which would seem a bit more like hiding a problem rather than solving/improving it.
I plan to introduce a script that recreates each language file based on the structure of en.txt. In that case all these values will be reset to the English values. But before that I want to remove some of the unused keys: viewtopic.php?p=71748#p71748

PS. Can I ask you to replace '\n\n\n' with '\n\n' (double blank lines with single ones) in en.txt and de.txt? It will slightly reduce the diff once my script is in use.

Dilvish
AI Lead and Programmer Emeritus
Posts: 4768
Joined: Sat Sep 22, 2012 6:25 pm

Re: Patch. Remove unused keys from en.txt

#24 Post by Dilvish »

Cjkjvfnby wrote:I plan to introduce a script that recreates each language file based on the structure of en.txt. In that case all these values will be reset to the English values. But before that I want to remove some of the unused keys: viewtopic.php?p=71748#p71748
I suggest you be sure to coordinate with adrian broher about any further work with translations and the stringtables, since he recently indicated he is going to be working on an overhaul.

This 'remove unused keys' project you've taken on has a big weakness/problem, though -- it's a huge headache to actually review enough to have reasonable confidence that the keys can be removed. Evidently this headache is too big for you even, because it looks to me like this latest version of the 'to remove' file, still hasn't been vetted very well. I see a great many HOTKEY_X entries, for example, and when I grep for HOTKEY_ it looks like these keys are being automatically created by Hotkey::UserStringForHotkey() in UI/Hotkeys.cpp. And that makes me lose appetite for checking about the remaining DESC_VAR_X and DESC_VALUE_X entries, let alone all the other entries. The only way this could really have a chance of working is if you explained for each entry, or each group of related entries, just what steps you took to conclude they were not in use, so that we could review your process and not have to simply do the entire job ourselves in order to assess the result you present. I can understand the sentiment of wanting to remove unused keys, but this is looking like far more headache than it is worth.
Cjkjvfnby wrote:PS. Can I ask you to replace '\n\n\n' with '\n\n' (double blank lines with single ones) in en.txt and de.txt? It will slightly reduce the diff once my script is in use.
My initial reaction is that the extra readability from having some double blank lines is likely well worth the slightly larger diff after your script works, or slight bit of extra complication in having your script look out for double blank lines -- that's something you could just scan for in a preprocessing step isn't it?
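
Such a preprocessing step could be as small as the sketch below; `collapse_blank_lines` is a hypothetical name, and the function would be applied to an in-memory copy of the file rather than to en.txt itself.

```python
import re


def collapse_blank_lines(text):
    """Replace every run of two or more blank lines with a single blank line."""
    return re.sub(r"\n{3,}", "\n\n", text)


print(repr(collapse_blank_lines("A\n1\n\n\nB\n2\n")))
```

One caveat: this would also collapse intentional double blank lines inside a multi-line ''' value, so a real script would probably need to skip those.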

Cjkjvfnby
AI Contributor
Posts: 539
Joined: Tue Jun 24, 2014 9:55 pm

Re: Patch. Remove unused keys from en.txt

#25 Post by Cjkjvfnby »

Dilvish wrote: I see a great many HOTKEY_X entries, for example, and when I grep for HOTKEY_ it looks like these keys are being automatically created by Hotkey::UserStringForHotkey() in UI/Hotkeys.cpp.
That was my question. Give me rules for determining whether a key is used by the system. How can I find out that these hotkeys are not outdated?

Maybe someone can point out keys that really can be removed, and we just remove them?

As a result of this thread I expect two things: outdated keys removed, and a document describing how keys are produced.
Dilvish wrote:My initial reaction is that the extra readability from having some double blank lines is likely well worth the slightly larger diff after your script works, or slight bit of extra complication in having your script look out for double blank lines -- that's something you could just scan for in a preprocessing step isn't it?

I use en.txt as a template and write it line by line to a new file, replacing values with those present in the current translation. When the blank lines in the two files do not match, it produces a diff line.
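
The template approach described above might look roughly like this. It is a sketch, not the actual script: `parse_pairs` and `rebuild_translation` are hypothetical names, keys are assumed to be runs of uppercase letters, digits and underscores, and multi-line ''' values are not handled.

```python
import re

# Assumption: a key is a run of uppercase letters, digits and underscores.
KEY = re.compile(r"[A-Z0-9_]+")


def parse_pairs(text):
    """Collect single-line key/value pairs from a stringtable."""
    lines = text.splitlines()
    table = {}
    i = 0
    while i + 1 < len(lines):
        key = lines[i].strip()
        if KEY.fullmatch(key):
            table[key] = lines[i + 1]
            i += 2
        else:
            i += 1
    return table


def rebuild_translation(template_text, translation_text):
    """Copy the template line by line, swapping in translated values where they exist."""
    translated = parse_pairs(translation_text)
    lines = template_text.splitlines()
    out = []
    i = 0
    while i < len(lines):
        key = lines[i].strip()
        if KEY.fullmatch(key) and i + 1 < len(lines):
            out.append(key)
            # Keys missing from the translation keep the template (English) value.
            out.append(translated.get(key, lines[i + 1]))
            i += 2
        else:
            out.append(lines[i])  # comments and blank lines are copied unchanged
            i += 1
    return "\n".join(out) + "\n"
```

Because comments and blank lines come only from the template, the regenerated file always has the template's layout, which is why mismatched blank lines in the existing translation show up in the diff.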

Dilvish
AI Lead and Programmer Emeritus
Posts: 4768
Joined: Sat Sep 22, 2012 6:25 pm

Re: Patch. Remove unused keys from en.txt

#26 Post by Dilvish »

Cjkjvfnby wrote:That was my question. Give me rules for determining whether a key is used by the system. How can I find out that these hotkeys are not outdated?...
Maybe someone can point out keys that really can be removed, and we just remove them?
I think it is fairly clear now that no one has it definitively in their memory just what keys can be removed. The path we've been on for determining which can be removed would be an iterative process -- doing the steps you've taken so far, reviewing the code for suspicious repeats like I did with "_LABEL" and "DESC_VAR_" before and "DESC_VAL_" and "HOTKEY_" now to see if the keys are being automatically determined from other tokens in some way, and then following up on what's revealed in the code to figure out the set of those tokens, then removing all the now-known-to-be-ok keys from the list, and again reviewing for any remaining suspicious repeats, etc. Finally some playtesting would need to be done.

Another, probably more reliable approach, would be to review the code for every occurrence of UserString("X") for fixed keys and every occurrence of UserString( something_more_complicated ) for not-necessarily-fixed strings and then closely examining the generation of all those not-necessarily-fixed strings to see whence they come. I would think that you probably have enough programming background to manage that even if you aren't trained in C++ (and now is a good time to learn :D )
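
A first pass at that review could be scripted. The sketch below separates calls with a fixed string literal from calls whose argument is built some other way; `scan_source` is a hypothetical helper, the regular expressions are deliberately simple (they work line by line and would miss calls split across lines), and any C++ fed to it in an example is made up for illustration.

```python
import re

# UserString("SOME_KEY") with nothing but a string literal as the argument.
FIXED = re.compile(r'\bUserString\(\s*"([^"]+)"\s*\)')
# Any call to UserString at all.
ANY = re.compile(r"\bUserString\(")


def scan_source(source):
    """Return (fixed_keys, dynamic_lines) for one source file's text."""
    fixed_keys = set()
    dynamic_lines = []
    for number, line in enumerate(source.splitlines(), start=1):
        fixed = FIXED.findall(line)
        fixed_keys.update(fixed)
        # More calls than literal-only calls means some argument is computed.
        if len(ANY.findall(line)) > len(fixed):
            dynamic_lines.append(number)
    return fixed_keys, dynamic_lines
```

The fixed keys can be compared against en.txt directly; the lines reported as dynamic are the ones that still need to be read by hand to see how the key is assembled.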
Cjkjvfnby wrote:As a result of this thread I expect two things: outdated keys removed, and a document describing how keys are produced.
These are laudable and valuable goals, but to me the time cost is appearing to be on the high side relative to the benefits. It's making me think I need to hurry up and finish reviewing/committing your AI python cleanup patches from a couple weeks ago so that you can get refocused on more of that :D This is also reminding me of the Debian contribution social guidelines I recently encountered; I am really thinking we should have our own similar guidelines or perhaps even link to theirs. As I have already pointed out, you've already received a fair bit of help with this and you seem to be reaching the limits of how much effort you are willing to put into this sub-project. Rather than asking explicitly if anyone else is motivated by this goal enough to contribute more of their time to helping you do these remaining parts (which frankly seem to me to be the most difficult/tedious parts), you instead seem to be treating it as if it would be a simple thing for someone to waltz in and finish this, and so you sort of seem to be repeatedly saying that you expect someone to do just that. I think you need to recognize that this would be a nontrivial task for anyone. You've taken on a lot of helpful tasks for the FO project and I don't at all want to discourage you in general, but I think you need to look hard at your approach to this particular task.
Cjkjvfnby wrote:I use en.txt as a template and write it line by line to a new file, replacing values with those present in the current translation. When the blank lines in the two files do not match, it produces a diff line.
I'm not understanding the problem with dealing with multiple blank lines, and on hearing of your plan here I'll further encourage you to be sure to coordinate with adrian. I expect he'll chime in here when he has the chance, but I recommend you pm him to be sure he knows this is a topic worth him checking in on, the boards have been quite busy lately.

adrian_broher
Programmer
Posts: 1156
Joined: Fri Mar 01, 2013 9:52 am
Location: Germany

Re: Patch. Remove unused keys from en.txt

#27 Post by adrian_broher »

Dilvish wrote:I suggest you be sure to coordinate with adrian broher about any further work with translations and the stringtables, since he recently indicated he is going to be working on an overhaul.
While this is true, there are still disagreements about whether it should be done. Geoff doesn't like the idea of transitioning to gettext (I would vote for the boost::locale implementation, but the gettext workflow is what the translators need to handle in the end). [1][2]
Dilvish wrote:I can understand the sentiment of wanting to remove unused keys, but this is looking like far more headache than it is worth.
I came to the same conclusion when working with the translations. In fact, I had a script that was meant to do the same thing as Cjkjvfnby's. But in the end I realized that reinventing a square wheel is a waste of effort. Those things are solved better with the gettext toolchain and other utilities around the po file format.

[1] viewtopic.php?p=70143#p70143
[2] viewtopic.php?p=63426#p63426
Resident code gremlin
Attached patches are released under GPL 2.0 or later.
Git author: Marcel Metz

Cjkjvfnby
AI Contributor
Posts: 539
Joined: Tue Jun 24, 2014 9:55 pm

Re: Patch. Remove unused keys from en.txt

#28 Post by Cjkjvfnby »

adrian_broher wrote:But in the end I realized that reinventing a square wheel is a waste of effort. Those things are solved better with the gettext toolchain and other utilities around the po file format.
I have some experience with gettext and it is really great.

See my thread about Transifex: viewtopic.php?f=27&t=9025. This can help to separate translators from coders.

Cjkjvfnby
AI Contributor
Posts: 539
Joined: Tue Jun 24, 2014 9:55 pm

Re: Patch. Remove unused keys from en.txt

#29 Post by Cjkjvfnby »

Let's finish.

This is a summary that I want to put on the wiki.
It may not be entirely accurate, but it is how I understand it.

Code: Select all

Where do the keys for translations come from?

1) from code, wrapped in UserString (both in C++ and Python)
2) from scripting files
3) auto-generated from tokens.h (entries converted to uppercase and prefixed with DESC_VAR_; not all entries need a key in the translations)
4) generated from an existing key (by adding a DESC_VAR_ prefix or a _LABEL suffix)
5) hotkeys, auto-generated (see Hotkey::UserStringForHotkey() in UI/Hotkeys.cpp)

How to maintain a translation?

en.txt is the base file; all others should have the same number of keys.
To add a key, find an appropriate place for it.

File syntax and style:

- the file consists of key and value pairs
- the value is written on the line after the key
- a single-line value can be written without any quotes
- a multi-line value should be wrapped in triple single quotes
- use # for comments
- use [[key]] to reference a key in the same file
- use [[prefix key]] to create a special link for that key in the encyclopedia
- there should be no duplicated keys (one of them will be ignored, which leads to confusion)
- each key should be written in upper case with _ as the word separator
- if a line starts or ends with a space, it is better to wrap it in triple single quotes
- use no blank line between related keys
- use a blank line between key groups
- separate blocks with a 3-line comment header and subblocks with a single-line comment
- use a double blank line before a block
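
To make the syntax rules above concrete, here is a minimal reader that follows them. It is a sketch based only on this summary, not on the game's real parser: `parse_stringtable` is a hypothetical name, and letting a later duplicate overwrite an earlier one is an assumption, since the summary only says that one of them is ignored.

```python
def parse_stringtable(text):
    """Parse key/value pairs, # comments and '''-quoted multi-line values into a dict."""
    table = {}
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i]
        if not line.strip() or line.lstrip().startswith("#"):
            i += 1  # blank line or comment
            continue
        key = line.strip()
        i += 1
        if i >= len(lines):
            break  # key without a value at the end of the file
        value = lines[i]
        i += 1
        if value.startswith("'''"):
            body = value[3:]
            # Keep reading lines until the closing triple quote appears.
            while "'''" not in body and i < len(lines):
                body += "\n" + lines[i]
                i += 1
            value = body.split("'''", 1)[0]
        table[key] = value  # assumption: a later duplicate replaces an earlier one
    return table
```

Quoted values keep their leading and trailing spaces, which is the reason the summary recommends triple quotes for such lines.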

Geoff the Medio
Programming, Design, Admin
Posts: 13603
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: Patch. Remove unused keys from en.txt

#30 Post by Geoff the Medio »

Via the [[SUBSTITUTION]] mechanism, a key may also exist only in the stringtable, in order to be referenced in other strings, and not be derived from anything in a script file or source code.

Such a key can exist in one translation but not another, as long as it is only referenced within translations where it is present (or referenced in a non-english translation if it appears in the english stringtable).
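
As an illustration of the rule above, the sketch below expands [[KEY]] references by looking first in the selected translation and then in the English table. It is not the engine's implementation: `expand` is a hypothetical function, tables are plain dicts, only plain [[KEY]] references are handled (not the [[prefix key]] form), and raising KeyError for an unresolved reference is an assumption.

```python
import re

# Plain references only; keys are assumed to be uppercase letters, digits and underscores.
REFERENCE = re.compile(r"\[\[([A-Z0-9_]+)\]\]")


def expand(key, table, fallback, _active=()):
    """Return the text for key with every [[KEY]] reference expanded recursively."""
    if key in _active:
        raise ValueError("circular reference: " + key)
    if key in table:
        raw = table[key]
    elif key in fallback:
        raw = fallback[key]
    else:
        raise KeyError(key)  # present in neither the translation nor the English table

    def replace(match):
        return expand(match.group(1), table, fallback, _active + (key,))

    return REFERENCE.sub(replace, raw)


es = {"METER_LAUNCH_RATE": "Cadencia de Lanzamiento"}
en = {"DESC_VAR_LAUNCHRATE": "[[METER_LAUNCH_RATE]]"}
print(expand("DESC_VAR_LAUNCHRATE", es, en))
```

Under these assumptions, a key that exists only in one translation resolves as long as every reference to it is made from that translation, while a reference to a key found in neither table fails.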
