Write eastern language Persian/Farsi/Arabic

Dear @uleysky, many thanks for your consideration. OK, I will try to make a list of all the important characters and ligatures, and also prepare a more complex sentence for tomorrow. :slight_smile:

@uleysky, I’ve selected all the required Persian letters (157 Unicode code points) plus the digits (10 code points). Please check selected.dat. I also checked ligatures.py, and I think we don’t need any of the ligatures, since they only beautify the text. So the total is 167 code points, well below 256. The data.csv file now contains a complex string that mixes Persian letters and digits with English letters.
data.dat (157 Bytes)
selected.dat (3.7 KB)
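
The count above (157 + 10 = 167, below 256) can also be checked mechanically for any sample text. This is only a minimal sketch, assuming a single-byte font encoding with 256 slots; the helper names `unique_code_points` and `fits_single_byte_encoding` are made up for illustration and are not part of GMT or arabic_reshaper.

```python
def unique_code_points(text):
    """Return the sorted list of distinct Unicode code points in text."""
    return sorted({ord(ch) for ch in text})


def fits_single_byte_encoding(text, slots=256):
    """True if every distinct character could get its own encoding slot."""
    return len(unique_code_points(text)) <= slots


# Sample string mixing Persian letters, Persian digits and Latin letters.
sample = "نقشه شماره ۱ و ۲ GMT"
points = unique_code_points(sample)
print(len(points), fits_single_byte_encoding(sample))
```

Note that the reshaped text uses presentation forms (initial, medial, final, isolated), so the check is more meaningful when run on the output of `arabic_reshaper.reshape` than on the raw input.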

I am using the following Python script:

#!/usr/bin/python
import arabic_reshaper
from bidi.algorithm import get_display

reshaped_text = arabic_reshaper.reshape(u'اولین نقشه با برنامه GMT که فونت پارسی را به درستی نمایش می دهد. نفشه شماره ۱ و ۲ و ۵')
bidi_text = get_display(reshaped_text)
print(bidi_text.encode('raw_unicode_escape'))

but something seems to have gone wrong. The text that should come after “GMT” appears before it, and vice versa.
Here is the result: ex31.pdf (72.3 KB)


In the PDF everything is OK except for one character (the character "و" is replaced by "م" in the word "فونت"). Please use the following script to make sure all characters are read correctly.

from pandas import read_csv
import arabic_reshaper
from bidi.algorithm import get_display

db = read_csv("data.dat", names=["lon","lat","text","v"])

reshaped_text = arabic_reshaper.reshape(db.text.values[0])
bidi_text = get_display(reshaped_text)
print(bidi_text.encode('raw_unicode_escape'))
#print(bidi_text)

Same. Perhaps a bug in the arabic_reshaper?

But when we use the print function inside the Python code, the output is correct! I added a print on the last line to check, and it was correct. Strange!

My mistake. I had mapped code point U+FEEE to the same glyph as U+FEE3. Here is the correct result: ex31.pdf (48.1 KB)
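
A mistake of this kind (two code points sent to the same glyph slot) can be caught automatically before rendering. The following is a rough sketch only: the `mapping` table and its slot numbers are invented for illustration and do not reproduce the real encoding used in this thread.

```python
import unicodedata
from collections import defaultdict


def find_glyph_collisions(mapping):
    """Return {slot: [code points]} for every slot used by more than one code point."""
    by_slot = defaultdict(list)
    for code_point, slot in mapping.items():
        by_slot[slot].append(code_point)
    return {slot: sorted(cps) for slot, cps in by_slot.items() if len(cps) > 1}


# Hypothetical table reproducing the mistake; the slot numbers are made up.
mapping = {0xFEE3: 0o201, 0xFEEE: 0o201, 0xFEED: 0o202}

for slot, cps in find_glyph_collisions(mapping).items():
    names = ", ".join(f"U+{cp:04X} {unicodedata.name(chr(cp))}" for cp in cps)
    print(f"slot {slot:03o}: {names}")
```

Printing the Unicode names makes it obvious when a form of "و" and a form of "م" have ended up in the same slot.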


Besides Persian, arabic_reshaper also supports Urdu, Pashto and Arabic, so there may be several candidates for some characters that look identical. That’s why I separated out the Persian code points in the selected.dat file.

Yes, now all Persian characters are correct, but the word “GMT” is replaced by squares! :slight_smile:

This demonstrates another problem: there are no Latin characters in the Arabic font (NotoSansArabic-Regular), so you have to switch the font (to NotoSans-Regular). It’s good that at least this can be done by means of GMT itself (@%NotoSans-Regular%GMT@%%). ex31.pdf (72.4 KB)
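
The font switch shown above could be applied automatically to every Latin run in a label. This is a sketch under stated assumptions: `wrap_latin_runs` is a hypothetical helper, a "Latin run" is taken to be ASCII letters optionally followed by ASCII letters or digits, and the function is presumably best applied to the string after reshaping and bidi reordering, so that the escape sequences themselves are not reordered.

```python
import re

# A run of ASCII letters, optionally followed by ASCII letters or digits.
LATIN_RUN = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def wrap_latin_runs(text, font="NotoSans-Regular"):
    """Wrap each Latin run in GMT font-switch escapes: @%font%...@%%."""
    return LATIN_RUN.sub(lambda m: f"@%{font}%{m.group(0)}@%%", text)


print(wrap_latin_runs("نقشه GMT شماره ۱"))
```

Persian letters and Persian digits are left untouched, so they keep using the Arabic font selected for the whole label.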

I think this problem can be solved by using any font that has both the Persian and English character sets, right?

The problem is solved anyway, font switching works fine as you can see.

Okay, now I have a rough idea of what needs to be done. I’ll start writing code to automatically generate a sequence of characters for GMT from a Unicode string. At the moment this is done manually, and you have seen the resulting errors. I don’t know if I will have time during the work week, but I plan to continue on the weekend.

Perfect.
Please let me know if I can do anything. Thanks again for the valuable time you’re spending. :slight_smile:

Here are some results: Farsi.zip (629.2 KB)

  1. Encoding file, farsi.txt. It contains the Unicode code points for the characters of the language: the four forms of each of the 32 letters plus some additional characters, taken from the Wikipedia article "Persian alphabet". Quite convenient when the language has 32 letters )
  2. Font encoding vector generator, two files: queryfont.cpp and mktable. It creates font-specific PostScript code for GMT and also creates PSL_custom_fonts.txt.
  3. Translator of Arabic Unicode into a format suitable for GMT. It has two parts: a generator of sed commands, gensed, and a run-time translator, ar.py. This is a bunch of crap code; I hope you will rewrite it properly in Python yourself. Embedding it in a script is also a pain.
  4. Test page generator, test/testtable. It creates a table where you can see how the font matches the encoding: which characters are present and which are missing.
  5. Your test example, test.sh and data.csv. Please note that you have to switch the font in data.csv, since your font does not contain Latin letters. GMT sequences are a problem for bidirectional text (
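
As a starting point for the Python rewrite of the translator in item 3, the core step might look like the sketch below. It is not the code from ar.py or gensed: the function name `to_gmt_escapes`, the table `slot_of` and its slot numbers are all assumptions made for illustration. It relies only on the fact that GMT text accepts octal character codes of the form \ooo.

```python
def to_gmt_escapes(visual_text, slot_of):
    """Translate an already reshaped and reordered string into GMT text.

    ASCII characters pass through unchanged; every other character is
    replaced by the octal escape of its slot in the font encoding.
    """
    out = []
    for ch in visual_text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        elif cp in slot_of:
            out.append(f"\\{slot_of[cp]:03o}")
        else:
            # Fail loudly so that missing entries in the table are noticed.
            raise KeyError(f"U+{cp:04X} is missing from the encoding table")
    return "".join(out)


# Hypothetical encoding table; a real one would be derived from farsi.txt.
slot_of = {0xFEED: 0o240, 0xFEEE: 0o241}
print(to_gmt_escapes("\uFEEE GMT", slot_of))
```

Raising an error for unmapped characters, rather than silently skipping them, should make it easier to see which symbols still need to be added to farsi.txt.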

Everything seems to work, but we may need to add additional symbols to the farsi.txt.

We will also need to create a localization file to draw ticks on the axes correctly.

Dear @uleysky, thanks again for the code you’ve provided; it looks very promising. I have a problem when running test.sh, though: compiling queryfont fails. The following is the terminal output.

(base) saeed@saeed-P453UJ:~/Downloads/Compressed/Farsi$ ./test.sh
/usr/bin/ld: /tmp/ccEJYVAs.o: in function `main':
queryfont.cpp:(.text+0xac): undefined reference to `FT_Init_FreeType'
/usr/bin/ld: queryfont.cpp:(.text+0x119): undefined reference to `FT_New_Face'
/usr/bin/ld: queryfont.cpp:(.text+0x18a): undefined reference to `FT_Select_Charmap'
/usr/bin/ld: queryfont.cpp:(.text+0x1e4): undefined reference to `FT_Get_First_Char'
/usr/bin/ld: queryfont.cpp:(.text+0x21a): undefined reference to `FT_Get_Glyph_Name'
/usr/bin/ld: queryfont.cpp:(.text+0x233): undefined reference to `FT_Face_GetVariantsOfChar'
/usr/bin/ld: queryfont.cpp:(.text+0x38e): undefined reference to `FT_Get_Next_Char'
/usr/bin/ld: queryfont.cpp:(.text+0x3a9): undefined reference to `FT_Done_Face'
/usr/bin/ld: queryfont.cpp:(.text+0x3b8): undefined reference to `FT_Done_FreeType'
collect2: error: ld returned 1 exit status
./mktable: line 79: ./queryfont: No such file or directory
./mktable: line 79: ./queryfont: No such file or directory
./test.sh: line 63: gawk: command not found
./test.sh: line 18: gawk: command not found
./test.sh: line 25: gawk: command not found
psconvert [ERROR]: The file /home/saeed/.gmt/sessions/gmt6.14377/gmt_0.ps- has no BoundingBox in the first 20 lines or last 256 bytes. Use -A option.
rm: cannot remove ‘queryfont’: No such file or directory

Any idea what is going wrong on my Ubuntu 20.04?

apt install libfreetype-dev, possibly?

I ran sudo apt-get install libfreetype6 libfreetype6-dev libfreetype-dev, but it is still not working :thinking:

Indeed, it does not work on Ubuntu. The solution is as simple as it is idiotic (as often on Ubuntu): instead of
g++ -o queryfont $(pkg-config --cflags --libs freetype2) queryfont.cpp
write
g++ queryfont.cpp -o queryfont $(pkg-config --cflags --libs freetype2)
The linker there resolves libraries in command-line order, so -lfreetype has to come after the source file that uses it. On Gentoo either variant works.

Yes, that is working now. I will do some tests and then inform you about the results. :slight_smile:

Dear @uleysky, I’ve tested the code with several fonts and a variety of text strings:
1- Everything works well now, especially with a font that supports both languages (Farsi and English characters together). For an example, see the font IRANSans.ttf: with it there is no need to specify the font inside the text string, and it also solves the tick annotation issue.
2- There are only minor issues with some special characters such as (? % …): although they are present in farsi.txt, they are shown in their Unicode form. Please see results.zip (111.9 KB)