Pharaoh + Cleopatra

Pharaoh - Problem of Diacritics
By Telariust
Diacritics are displayed with the wrong height and are mixed up/incorrect.
   
Welcome
(This guide is in the beta stage, I'm still improving it)

Chief Advisor, Telariust, welcomes you!



For FR, DE, IT, ES, SE, PL, PT, CN, KR, TH... and others

Dear international readers!
Use Google Chrome to automatically translate the article.


In brief: the essentials
Using the EN .EXE with a custom translation guarantees that diacritics will be displayed incorrectly.

1) Diacritics are displayed with wrong Height;
Everyone (FR/DE/IT/ES/SE/PL) suffers from this problem when they try to play on EN Pharaoh.exe
What does the problem look like


Why is this happening

For "a-Z 0-9" EN/FR
- top alignment;

For unicode (diacritics) EN/FR
- up to 12/17 pixels high, top alignment;
- more than 12/17 pixels, bottom alignment (from the middle of the line);

Solutions
  • Use FR Resizer;
  • Use Font with fixed alignment; (not ready yet)
  • Use patch "FR border fix" to EN Resizer; (not ready yet)

I'm not sure I can create the "FR border fix" patch in a reasonable amount of time.
If you have the desire and the opportunity to do it before me, I will only be glad.
In fact, the whole point of my series of articles is to encourage others to take action.
I can't handle everything.

2) Diacritics are mixed/incorrect;


The translation is bound to the Font via a (magic) "text encoding" String in Pharaoh.exe.
Using any convenient HEX-editor, you can find the "text encoding" String in the native .EXE and transfer it to the new .EXE.

This works for the entire series of games Caesar3 + Pharaoh + Zeus + Emperor.
This way you can easily change Translations + Fonts.

The EN and the FR/DE/IT/ES/SE fonts are almost the same and only have minor height adjustments of a few letters (by 1-2 pixels).
Therefore they all use the same EN "text encoding" String in the .EXE.
Therefore, this problem occurs only for some languages (PL/RU/..)


Solution on example of PL
Can I run in [polish] language?

Polish translation can be found here
megawrzuta . pl/download/592fcd711f144eec70ed9a5fb70522f3.html
Install by copy-paste method with replacement.
Files "Pharaoh_Fonts.555" and "Pharaoh_Fonts.sg3" need to be replaced in the "/Data" folder (which is not obvious)



To get started, turn off auto-update for the Game in Steam so that your changes are not overwritten.
(Does this old game have an auto-update system on Steam?)

Using custom PL translation guarantees the problem of incorrect display of PL diacritics.
(If you use EN Pharaoh.exe)

Ironically, the Poles suffer the most from diacritics.
(GOG is a Polish company, it owns the rights to distribute the Game)

1) Diacritics are displayed with wrong Height;
Everyone (FR/DE/IT/ES/SE/PL) suffers from this problem when they try to play on EN Pharaoh.exe

Why does everyone play EN?
- Steam/GOG sell only the EN version;
- best widescreen Resizer is only available for EN;

Why can FR do what EN can't?

For "a-Z 0-9" EN/FR
- top alignment;

For unicode (diacritics) EN/FR
- up to 12/17 pixels high, top alignment;
- more than 12/17 pixels, bottom alignment (from the middle of the line);

Solutions
  • Find the code in the FR .EXE that corrects the middle-of-line border for diacritics;
    Once this code is found and transferred to the EN Resizer, there will be no more need to depend on the outdated FR Resizer.
    I'm working on it, but for now use FR.
  • Fix the EN/PL Font to Height alignment (which is not always possible);
  • Create a Font with Height alignment;
    For example, create a bottom-aligned Font with letters of the same height.
    Letters without tails below the line level, like hieroglyphs and cuneiform (as in Zeus).

Widescreen FR Resizer (FR Pharaoh.exe v2.0) (16 Jul 2016, Crudelios)
www.moddb.com/mods/pharaoh-resizer-full-hd-enfr/downloads/pharaoh-resizer-for-french
Unfortunately, the FR Resizer has not been updated, unlike the EN Resizer.
- Pharaoh.exe v2.0 instead of v2.1;
pharaoh.heavengames.com/downloads/showfile.php?fileid=1103
(points with the logic of actions)
- defect in displaying the World Map;
- if resolution is 4K (more than 1920x*), then Game crashes when opening the World Map;


2) Diacritics are mixed/incorrect;
Translation is bound by Font via (magic) "text encoding" String to Pharaoh.exe
The EN and the FR/DE/IT/ES/SE fonts are almost the same and only have minor height adjustments of a few letters (by 1-2 pixels).
Therefore they all use the same EN "text encoding" String in Pharaoh.exe.

But in the PL Font, 556 out of 1340 images are partially or completely changed.
This is about 41%!
In fact, this is already a different Font, so it requires its own "text encoding" String in Pharaoh.exe.
The PL "text encoding" String differs only slightly from EN, which keeps the text readable despite the defects.

A ready-made one exists (2013, by JackFuste)
https://web.archive.org/web/20190910011209/www.wsgf.org/phpBB3/viewtopic.php?f=64&t=14149&start=200#p149187
This is "PL Pharaoh.exe v2.1 GOLD 2008" with resolutions of 1920x1080 and 1920x1200.
It has a defect on the World Map.
But it fixes some interface bugs typical of the early EN ready-made kits (2012, by JackFuste).
In fact, it was a prototype on the way to the EN and FR Resizer (2016, by Crudelios).
(it is almost unique; the only thing like it is RU 7Wolf Pharaoh.exe with 1920x1080 and 1360x768)
I tried to change the screen resolutions for it, but it's just a prototype!
I can't find some of the parameters: they are not yet in .EXE 2012 but are no longer in .EXE 2016.
And the Game also crashes if you set it to 4K.
Therefore, we only need it as a source of the PL "text encoding" String.
Offset for Pharaoh
...
1E049C - v2.1 EN
1E051C - v2.1 PL
29EE44 - v2.0 FR
...

PL String from "PL Pharaoh.exe v2.1 GOLD 2008"
240 bytes (224 + 16 bytes "555 556 565 655 ")

Offset    0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
00000000  00 3F 40 00 00 41 00 4A 43 44 42 46 4E 45 4F 4D  ?@ A JCDBFNEOM
00000010  3E 35 36 37 38 39 3A 3B 3C 3D 48 49 00 47 00 4B  >56789:;<=HI G K
00000020  00 1B 1C 1D 1E 1F 20 21 22 23 24 25 26 27 28 29  !"#$%&'()
00000030  2A 2B 2C 2D 2E 2F 30 31 32 33 34 00 63 00 00 50  *+,-./01234 c P
00000040  00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000050  10 11 12 13 14 15 16 17 18 19 1A 00 00 00 00 00
00000060  65 61 56 54 51 53 01 67 81 55 57 59 73 5D 69 76  eaVTQS g UWYs]iv
00000070  6A 67 6D 60 5D 5F 64 63 19 7B 6B 00 54 00 00 57  jgm`]_dc {k T W
00000080  52 7F 5E 75 66 70 01 0F 80 00 00 00 00 00 00 77  R ^ufp Ђ w
00000090  82 00 00 56 00 00 00 00 00 51 00 00 00 00 00 58  ‚ V Q X
000000A0  72 70 71 71 69 83 72 65 74 6A 71 73 77 75 76 76  rpqqiѓretjqswuvv
000000B0  00 74 7A 78 79 79 7B 00 84 7E 7C 7D 6B 33 00 68  tzxyy{ „~|}k3 h
000000C0  53 53 54 51 51 85 53 65 57 56 52 55 5B 5A 5C 59  SSTQQ…SeWVRU[Z\Y
000000D0  00 55 5F 59 60 60 5D 00 86 63 62 64 61 19 00 19  U_Y``] †cbda
000000E0  35 35 35 00 35 35 36 00 35 36 35 00 36 35 35 00  555 556 565 655

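Open the widescreen Pharaoh.exe in any HEX-editor.
Go to offset 29EE44 (for FR Pharaoh.exe v2.0) and replace the 240 bytes there with the PL String.

For another .EXE version the Offset will be different, but it can easily be found by the trailing "555 556 565 655 " (in the .EXE the four separators are 00 bytes, as the last row of the dump shows).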
This works for the entire series of games Caesar3 + Pharaoh + Zeus + Emperor.
This way you can easily change Translations + Fonts.
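
The manual HEX-editor steps above can also be scripted. Below is a minimal Python sketch, not a tested patcher. It assumes that the String is exactly 224 bytes followed by the 16-byte marker (35 35 35 00 35 35 36 00 35 36 35 00 36 35 35 00, as in the last row of the dump) and that the marker occurs only once in the .EXE, so it will not handle the 244-byte "mutant". All function names are made up for illustration.

```python
from pathlib import Path

# Marker that ends the "text encoding" String: "555", "556", "565", "655",
# each followed by a 00 byte (taken from the last row of the dump).
MARKER = b"555\x00556\x00565\x00655\x00"
TABLE_LEN = 224                        # assumed usual length without the marker
STRING_LEN = TABLE_LEN + len(MARKER)   # 240 bytes in total


def find_encoding_string(exe: bytes) -> int:
    """Return the offset where the 240-byte String starts."""
    pos = exe.find(MARKER)
    if pos < TABLE_LEN:  # also covers "not found" (-1)
        raise ValueError("marker not found or too close to the start")
    if exe.find(MARKER, pos + 1) != -1:
        raise ValueError("marker found more than once, check by hand")
    return pos - TABLE_LEN


def transplant(native_exe: bytes, target_exe: bytes) -> bytes:
    """Copy the String from the native .EXE into a copy of the target .EXE."""
    src = find_encoding_string(native_exe)
    dst = find_encoding_string(target_exe)
    string = native_exe[src:src + STRING_LEN]
    return target_exe[:dst] + string + target_exe[dst + STRING_LEN:]


def patch_file(native_path: str, target_path: str, out_path: str) -> None:
    """Read both .EXEs and write the patched copy to out_path."""
    patched = transplant(Path(native_path).read_bytes(),
                         Path(target_path).read_bytes())
    Path(out_path).write_bytes(patched)
```

For example, patch_file("PL/Pharaoh.exe", "Pharaoh.exe", "Pharaoh_new.exe") with your own paths (these are hypothetical). The original file is never modified, so you can always go back.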

UPDATED
Using any convenient HEX-editor, you can find the "text encoding" String in the native .EXE and transfer it to the new .EXE.
Or you can even use "certutil", which is built into Win7, Win8, Win10:
Press "Win+R" or Start => Search => cmd.exe (maybe Run as Admin)
Go to the game folder
cd /d "D:/games/Pharaoh"
Create a HEX-dump of Pharaoh.exe
certutil -f -encodeHex Pharaoh.exe HEX.txt >nul
Open HEX.txt in regular Notepad, find the offset by looking for "555 556 565 655 ", and change the 240 HEX cells to the new ones.
(the rightmost column has no effect, it's just a visual HEX=>ASCII representation so that you at least roughly understand what's in the HEX cells)
Save in Notepad and run the inverse transformation.
certutil -f -decodeHex HEX.txt Pharaoh_new.exe >nul
Start the Game with the new file and check!

Here "PL String" is the "text encoding" String from "PL Pharaoh.exe v2.1 GOLD 2008"



There are at least 2 native PL .EXEs

Here "PL v2.0 244b" is a mutant of unknown origin, whose "text encoding" String is 244 bytes long (against the usual 224 bytes)
(if you count the +16 bytes "555 556 565 655 " at the end, then 260 against the usual 240)
This is a unique case (although scouts report that there is supposedly an .EXE with an anomalous String for Zeus RU by Triada)



Bug in PL Font
There is an extra letter, #1296, that shouldn't exist at all.
It shifts the list of letters, so every letter after it gets mixed up.

But this is the very last subFont, which is either unused or a "shadow".
Every second subFont is a shadow of the one before it.
(when you hover the mouse and the text gets darker)
Perhaps this bug landed on such a "shadow" subFont.


Detailed Research
Font research, facts and hypotheses

Recent tests have shown that the type of a letter has no effect on alignment.
The type indicates compression only.



About EN Font

The EN Font contains 200 images (basic, system.bmp) and 1340 images (letters, fonts.bmp).
Letters can be roughly grouped as 10 subFonts (as bitmap_id), each with 134 images.
SubFont consists of 80 "a-Z 0-9" images and 54 unicode (diacritics) images.
Often a subFont is followed by exactly the same subFont, slightly darker.
This is a shadow subfont so that the text becomes darker on hover.
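
The numbering above (10 subFonts of 134 images, 80 "a-Z 0-9" plus 54 diacritics) is easy to get wrong by hand, so here is a small Python sketch of the arithmetic. It assumes that letters are numbered from 1, as in the ranges 135-214 and 215-268 used in this article; the function name is mine.

```python
IMAGES_PER_SUBFONT = 134   # 80 "a-Z 0-9" + 54 diacritics
BASIC_PER_SUBFONT = 80
SUBFONTS = 10              # 10 * 134 = 1340 letter images


def locate_letter(image_no: int) -> tuple[int, int, str]:
    """Map a 1-based letter number (1..1340) to
    (subFont number, position inside the subFont, group)."""
    if not 1 <= image_no <= IMAGES_PER_SUBFONT * SUBFONTS:
        raise ValueError("letter number out of range")
    subfont, pos = divmod(image_no - 1, IMAGES_PER_SUBFONT)
    group = "a-Z 0-9" if pos < BASIC_PER_SUBFONT else "diacritics"
    return subfont + 1, pos + 1, group
```

With this, locate_letter(215) gives (2, 81, "diacritics"), and letter #1296 from the PL Font bug lands in subFont 10, which agrees with "the very last subFont".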

By the way, the letter "a" in each subFont has type=256 (unlike all other letters, which have type=0).
Pharaoh added transparency and compression based on it; they probably wanted to demonstrate it on "a".



About diacritical problem
See how the display of all letters differs between the EN and FR .EXE


Note, letter "ç" (0xE7) (ascii=231) has unique status and is displayed correctly even in EN .EXE.



The first subFont, small, is used for tooltips.
Almost none of the Publishers changed it and used it as is.
Its study gives an understanding of the causes of problems.
Each letter is Top-aligned and stored as type=276 (black-and-white font).
However, the alignment of letters 1-80 ("a-Z 0-9") differs from that of 81-134 (diacritics) by 3 pixels.
Maybe the letters 81-134 were added to the subFont without checking.
But when the problem escalated, it was faster to insert code into the .EXE than to fix the Font.
I believe this is where the problem with different letter heights originated.



The second subFont, medium, is used in the Menu.
"a-Z 0-9"
unicode (diacritics)

Letters 135-214 ("a-Z 0-9") are Top aligned and stored as type=0 (uncompressed)
Letters 215-268 (diacritics) are Bottom aligned and stored as type=0 (uncompressed)

As it turned out, FR/DE/IT/ES/SE fonts are almost the same as EN.
For several diacritical letters, the height is adjusted by only 1 pixel.
Therefore, when transferring translation files, everything will work (even if you forgot about the Font).

Also, the average level for 215-268 is 2 pixels lower than for 135-214.
Why fix it by 1 pixel when 2 pixels are needed?..
Whether on the Bottom or on the Top, 2 pixels are needed...
(later I figured out that exactly 1 pixel is needed in the FR .EXE to align "a-Z 0-9" with the diacritics)



You see the wrong Height of diacritical letters in the Game.
When I created this image, I realized that I did not understand at all how the alignment works.
Here you can see how the factors contradict each other.
Only one thing became clear: everything is complicated, and there is a certain formula that changes the alignment behavior.
(it may involve calculating the arithmetic mean and something else)



Here you can see that the alignment Height depends on the ASCII code of the letter.



C3Modder bugs prevented me from experimenting with changing letter size (see other section).
So for a while I just researched already existing fonts from different languages.

This is how I discovered that the Thai font and the RU (Fargus) font are exceptional.
All letters in them have type=256 (compressed).
The alignment isn't perfect (from the middle of the line), but at least the letters don't cross the ceiling.

Because of this test, I thought for a long time that alignment depends on type (but it doesn't).



I finally managed to replace a few letters, changing their Height (and Width).
Finally it became clear how it works. This combination is unthinkable.
For type=0
For type=256
No difference between type=0 and type=256?
It looks like bottom alignment for type=256 is nothing more than an illusion caused by the chosen height of the letters!
That is, substituting the type makes no sense.

For "a-Z 0-9" EN/FR
- top alignment;

For unicode (diacritics) EN/FR
- up to 12/17 pixels high, top alignment;
- more than 12/17 pixels, bottom alignment (from the middle of the line);
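
The rule above can be written down as a tiny Python sketch. Reading "12/17" as "12 pixels in the EN .EXE, 17 pixels in the FR .EXE" is an assumption on my part, and the function and constant names are hypothetical. This only restates the observed behavior, it is not code taken from the .EXE.

```python
EN_LIMIT = 12   # assumed height limit in the EN .EXE
FR_LIMIT = 17   # assumed height limit in the FR .EXE


def vertical_alignment(is_diacritic: bool, height: int, limit: int) -> str:
    """Return how a letter image is aligned according to the observed rule."""
    if not is_diacritic:
        return "top"                      # "a-Z 0-9" are always top-aligned
    if height <= limit:
        return "top"                      # low diacritics behave like "a-Z 0-9"
    return "bottom (from the middle of the line)"
```

Under this reading, a diacritical letter 14 pixels high would be top-aligned with FR_LIMIT but bottom-aligned with EN_LIMIT, which would explain why the same Font looks right in the FR .EXE and jumps in the EN .EXE.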

Counting from the middle of the line. Why is it like this?
It was 1999, the birth of standards (including fonts).
I assume that diacritical characters were originally supposed to consist only of dots/accents/tails, which were supposed to be displayed above the usual a-Z.



About alignment
Here you can see that in the Font RU (Fargus) the diacritical letters are replaced with Russian letters.
But why is the Letter Height normal?
Because of type=256 (Bottom), the same Height for all letters, and a Height of up to 12 pixels.
You can notice that the Font is slightly raised (2 pixels up from the middle of the line).
That is because letters with tails can be added to a bottom-aligned Font only by raising the overall level. Such a sacrifice.

But what happens if you do NOT raise the overall level when moving a Font with tails to bottom alignment?

Fortunately, the letter RU "ф" (as EN "f") is rarely found in words, and the letter RU "p" in the translation is replaced by EN "p".



About alignment
- same Height of letters;
- Height up to 12 pixels high (top alignment);
- explicit type=256 (Bottom)
The Font RU (Fargus) is fully aligned and behaves the same on EN and FR .EXE
The Font Thai is not Height-aligned, so it has a spread on EN and FR .EXE




About the Font Thai
Unfortunately, the FR .EXE does not solve the diacritics problem for TH (Thai).
FR solves the Height problem, but Thai also has a Width problem.
Can this be added to widescreen EN v2.1?.. I have no idea.

By the way, this is what correct Thai looks like. Yes! There is +1 item in the Main Menu!

Thai(TH), Korean(KR), Simplified Chinese(CN) - EN letters can be substituted.

Chinese Traditional requires BIG5 support.
(I have seen such modified .EXEs. Of course they are not compatible and not portable)



About the Font RU (1C)
In the RU .EXE, the alignment for unicode letters was changed to match "a-Z 0-9".
Therefore, the Font RU (1C) is incompatible with the EN .EXE.

The font will have to be recreated with a changed type, otherwise it will not work on widescreen EN.


(working notes)
The problem of the jumping height of diacritics in European languages.
(à ì ò ù, é, ô, Ä, d, Ö, ö, Ü, ü, ß ...)
(for the Font RU 1C this affects uppercase handwritten letters with lower tails: дзруфцщ)
I assumed the existence in the .EXE of a Table of "position adjustments" for letters.
The plan was to search for the table by the HEX codes of the unicode characters.
I found two pairs of .EXEs of equal versions and nearly the same size, one of each pair showing the problem and the other NOT.
Several days of comparative analysis led to the understanding that the table either does not exist at all or does not look the way I originally imagined (resource, structure).
Replacing different pieces of code between the .EXEs neither fixed the problem nor caused it.
An analysis of the Fonts in various Editions suggests that the table was not originally planned, but is a crutch that corrects the problems of the Fonts.

It's not a Table, it's a Cycle (binary? unrolled by the compiler?).
An adjustment loop for the border alignment (middle of the line?) of the ASCII group of unicode letters.
In the EN .EXE, the ASCII group is limited to around "a-Z 0-9" (unicode letters are cut out).
In the FR .EXE, the ASCII group covers the unicode letters and the middle of the line.
In the PL .EXE, a mix of EN+FR.
Letter "ç" (0xE7) (ascii=231) has unique status and is displayed correctly even in EN .EXE.


That is, EN may differ from FR in the value of only one byte (the height of the middle of the line).

Until Cycle is found:
- use FR .EXE;
- fix Font to Height alignment (which is not always possible);
- create Font with Height alignment;



Anomalous Font detected, Zeus translation by RU Triada.
The Height of each letter is 50.

Perhaps too high a letter height switches the alignment in the .EXE to Top.
Or in the Font there is a threshold adjustment through unknown flags.
This discovery could be key.

There are unknown flags in the Font that may well be responsible for the type of alignment...

*.sg3 bitmap
+65 filename;
+51 comment;
+4 width;
+4 height;
+4 num_images;
+4 start_index;
+4 end_index;
+4 unknown (between start & end);
+16 unknown (4x4);
+8 unknown (2x4 width & height);
+12 unknown (3x4 internal image);
+24 unknown;


*.555 images
64bytes per image
+4 offset;
+4 length;
+4 uncompressed_length;
+4 zero bytes;
+4 invert_offset;
+2 width;
+2 height;
+26 unknown;
+2 type;
+4 flags/option-like (flags[0] is .extern_flag);
+1 bitmap_id;
+3 unknown;
+4 zero bytes;

(if 72bytes per image)
+4 (alpha_offset, For D6 and up SG3)
+4 (alpha_length, For D6 and up SG3)
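
To check the field sizes above, here is a Python sketch that unpacks one 64-byte image record and one 200-byte bitmap record with the standard struct module. The field order and sizes come from the lists above; little-endian byte order, the signed invert_offset and the NUL-terminated ASCII filename are assumptions. Where the records start inside the file is not covered here, so both functions take the raw bytes of a single record.

```python
import struct

# 64-byte image record: 4+4+4+4+4+2+2+26+2+4+1+3+4 = 64
IMAGE_RECORD = struct.Struct("<IIIIiHH26sH4sB3s4s")

# 200-byte bitmap record: 65+51+6*4+16+8+12+24 = 200
BITMAP_RECORD = struct.Struct("<65s51sIIIIII16s8s12s24s")


def parse_image_record(raw: bytes) -> dict:
    """Unpack the known fields of one image record, skipping unknown ones."""
    (offset, length, uncompressed_length, _zero, invert_offset,
     width, height, _unknown26, image_type, flags, bitmap_id,
     _unknown3, _zero2) = IMAGE_RECORD.unpack(raw)
    return {
        "offset": offset,
        "length": length,
        "uncompressed_length": uncompressed_length,
        "invert_offset": invert_offset,
        "width": width,
        "height": height,
        "type": image_type,        # 0, 256, 276 ... as discussed above
        "flags": flags,            # 4 raw bytes, flags[0] is .extern_flag
        "bitmap_id": bitmap_id,
    }


def parse_bitmap_record(raw: bytes) -> dict:
    """Unpack the known fields of one bitmap record."""
    (filename, _comment, width, height, num_images, start_index,
     end_index, _unknown, _u16, _u8, _u12, _u24) = BITMAP_RECORD.unpack(raw)
    return {
        "filename": filename.split(b"\x00", 1)[0].decode("ascii", "replace"),
        "width": width,
        "height": height,
        "num_images": num_images,
        "start_index": start_index,
        "end_index": end_index,
    }
```

The unknown fields are deliberately dropped from the result; if one of them turns out to control alignment, it is easy to add it to the returned dictionary.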


How to edit the Font
"SGReader" for reading textures and "C3Modder" for writing textures

*.555 stores pictures and animation frames.
*.sg3 stores ID, offset, length, width, height, type, ..

Moved to separate
https://steamcommunity.com/sharedfiles/filedetails/?id=2770842807


Download Fonts
Planned release
1) Pharaoh EN: a Font that no longer requires the FR .EXE for FR/DE/IT/ES/SE;
2) Pharaoh PL: a Font that no longer requires the FR .EXE for PL;
3) Pharaoh RU (1C): a Font that works correctly on the EN .EXE;
4) Emperor RU (ZOG): a Font (I already forgot what was wrong with it);
5) Large Fonts for Caesar3 + Pharaoh + Zeus + Emperor for the widescreen .EXE.



Large Fonts for WideScreen

They are not here. Yes, bummer.
Because they are nowhere, they do not yet exist.
But they are needed. By everyone. For 10 years already.
(since the widescreen patches came out)
And with the help of SGReader + C3Modder you can make them.
Are you savvy?

It is possible to increase the font size up to x1.5 without breaking lines.

You can switch completely to capital letters only.
Letters of the same height are more consistent with ancient writing, since hieroglyphs and cuneiform have the same character height (with rare exceptions).

Ideally, of course, this problem should be solved by scaling the entire interface, as is done in Julius, the Caesar3 clone.
But the complexity of creating such code is comparable to or exceeds that of creating the widescreen patches.


Goodbye

Originally posted by "RU_ORIGIN":
- Знаешь, моя любимая часть в изучении иностранных языков - это возможность перенять себе эти новые клевые диакритические знаки, типа таких!

- Прекрати. Некоторых из этих букв даже не существует.
Originally posted by "RU_to_EN":
- You know, my favorite part of learning foreign languages is the opportunity to adopt these new cool diacritics, like these!

- Stop it. Some of these letters do not exist.
Originally posted by "EN_to_Diacritics":
- ʮøü Ҝñøᾧ ṁў ḟàѷør̥їțé ṗàr̥ț öḟ łéàr̥ñїñģ ḟør̥éїģñ łàñģüàģéš їs àñ øṗṗør̥țüñїțў àḋøṗṫ țḥéšé ñéᾧ çøøł ḋїàçr̥їțїçš łїҜé ṫḥàț!

- Stop it. Some of these letters do not exist.
[translate.google] CORSICAN (DETERMINED AUTOMATICALLY)

Corsican? Ok, thx :D