Questions tagged [cjk]

CJK stands for Chinese, Japanese and Korean and is used to label issues common to these East Asian languages and their large character repertoires.

0
votes
0answers
19 views

Why can't the phrase be searched in my Sphinx?

I have installed the Python documentation generator Sphinx and posted some articles in it. Recently I posted an article named Ʊɸѡ. Searching for ɸѡ finds it. Searching for the whole name of the article, Ʊɸѡ, in the ...
0
votes
0answers
11 views

Japanese Offline Text Recognition on iOS - Options?

For my project, I am currently looking for a way to recognize text in a given picture. The challenge is that the text to be recognized is in Japanese. This means the distinction between Hiragana/...
0
votes
0answers
9 views

pdfmake: is it possible to programmatically remove or skip 'tofu' characters?

I asked a question previously about how to render foreign language characters using the pdfmake module in NodeJS. An important part of the problem is that there are many lines of text that contain ...
0
votes
1answer
23 views

pdfmake: install custom fonts on server side for CJK, Arabic, and other foreign languages

I am using PDFMake to generate PDFs on the server side with NodeJS 12. The PDFs are rendering text that has a mix of English and foreign language characters. The PDFs are working, however, none of the ...
0
votes
1answer
24 views

Are Asian fonts broken in iOS 13?

We're building an iPad app with Qt 5.12.6 that supports Japanese in its UI. For several releases, switching the device to Japanese has worked and our text has displayed fine. But with iOS 13, most of the ...
3
votes
2answers
32 views

Kana input with Japanese 106/109 key keyboard in Ubuntu 18.04

I have a Japanese 108-key keyboard that I have been using for a long time on en-US Windows 7. It works correctly including keys that switch alpha/hiragana/katakana etc. Recently I installed Ubuntu ...
0
votes
0answers
47 views

How to make English text appear the same size as Japanese text?

My website includes both English and Japanese characters. The problem is that although they are defined in the same class, they look as if they are different sizes in certain environments. For example,...
0
votes
0answers
23 views

How to set the body of an email to a string containing multibyte characters?

When I try to modify the body of a parsed email message, the resulting message is encoded wrongly. This is a minimal example: raw_message = '''From: [email protected] To: [email protected] Subject: ...
0
votes
2answers
21 views

How to interpret bytes for UTF-8 encoded Hiragana?

I have a string "Ϥ" and I'm trying to understand how it's represented as bytes. Number.prototype.toBits = function () { let str = this.toString(2); return str.padStart(8, "0"); } let ja = "...
0
votes
0answers
23 views

R: Tesseract OCR engine language breaks

I am trying to read Korean text using the Tesseract OCR engine. I am using the image below, but when I run my code, I get a weird result. library(tesseract) file_input <- "path/to/image.jpg" ...
1
vote
0answers
23 views

How can I display Korean in moviepy?

My code: #-- coding:utf-8 -- from moviepy.editor import * myclip = VideoFileClip("55.mp4").subclip(10,40) t = u'55 '.encode('utf-8') txtclip = TextClip(t, fontsize=50, color='red', font='...
1
vote
1answer
35 views

How to sort a list of strings combining Japanese and Latin in Python

I have a problem with sorting in Python. Can anybody help me, please? Thanks a lot! I want to sort a list the way Excel sorts it. Original list: table = [ u"Ů~ʧ", # 2 u"", # 3 ...
2
votes
0answers
23 views

Ignore Japanese Characters for ICU4J transliterations

Is there a way to ignore all the Japanese characters for Transliteration using the ICU4J library? What are the IDs of the transliterators that participate in the Japanese Transliteration? The ...
1
vote
0answers
142 views

SQLite/AutoHotkey: I have a problem with the encoding of the sqlite3_result_text return function

I am writing a user-defined function with SQLite in AutoHotkey. It works as I intended when I return English only. But if I return any non-English character, it produces broken ...
3
votes
0answers
47 views

How do I input Chinese text into VBA Editor in Excel 2010?

I need to insert Chinese characters into the Properties of the VBA editor. I am using Windows 10 and Excel 2010. I have just re-installed the Simplified Chinese language pack in Control Panel>Region>...
2
votes
1answer
38 views

How to create a SELECT query for each record in a table?

I have a table, SingleCrossReference containing a list of words in Japanese. I want to query this table, and for each record, count how many times that word string appears in a separate table, Keyword....
-1
votes
2answers
50 views

Find duplicates in case-sensitive query in MS Access

I have a table containing Japanese text, in which I believe that there are some duplicate rows. I want to write a SELECT query that returns all duplicate rows. So I tried running the following query ...
1
vote
1answer
18 views

MS Access Query does not differentiate hiragana and katakana with standard equality operator

I recently ran into a problem with an MS Access query where I was searching a table containing Japanese text. Japanese has two syllabaries, hiragana and katakana, with the same sound values, but ...
1
vote
0answers
29 views

hreflang for China Taiwan zh-Hans zh-Hant or zh-cn zh-tw

Thanks for any input. For hreflang, should I use zh-Hans/zh-Hant or zh-cn/zh-tw, as seen below? <a href="https://www.websitename*^%$.com/zh-hans/index.php" rel="alternate" ...
0
votes
0answers
28 views

LaTeX: define Chinese font style for header

I am trying to include a few Chinese words in my document header using the package \usepackage{CJKutf8}. The following gives me a pretty good Chinese font in text, but I wasn't able to incorporate it ...
0
votes
0answers
16 views

TCPDF kozminproregular font with CJK values working in Windows 10 but not in Windows 8

For some reason, PDFs that are viewed in Windows 8 do not show CJK (Chinese, Japanese, Korean) characters correctly. However, when they are viewed in Windows 10, all characters display fine. I've ...
1
vote
1answer
43 views

How do I append sentiment values from oseti to a pandas dataframe?

First post here! After struggling with mecab and encodings I got oseti to work for Japanese sentiment analysis, where oseti.Analyzer() takes a string and prints a list with one value per sentence: &...
0
votes
1answer
41 views

JavaScript/NodeJS RTF CJK Conversions

I'm working on a node module that parses RTF files and does some find and replace. I have already come up with a solution for special characters expressed in escaped unicode here, but have run into a ...
0
votes
1answer
35 views

How to copy and rename files whose names are in Chinese with VBA

I have the path of the files stored in MS Access in a table. The table was made to rename a specific set of files, so the important fields are oldpath and newpath. These are used in VBA. First I bring ...
0
votes
0answers
22 views

UDF in Impala: received Chinese characters changed to ?

The UDF works in Hive, but not in Impala. Chinese characters work in Hive but are changed to ?? in Impala. I made a new UDF to print the bytes of the input String: public class GetBytes extends UDF { public String ...
0
votes
1answer
32 views

Chinese in Japanese encoding

This may sound like a stupid question. I typed some Chinese characters into an empty text file in the VS Code text editor (default UTF-8). Then I saved the file in an encoding for Japanese: Shift JIS, ...
1
vote
2answers
57 views

Japanese Email Webfont

I've been working on an email template with Japanese characters and have been annoyed for days by this issue. The problem is that my desired output for font rendering on an email template is not achieved. I'm ...
1
vote
2answers
117 views

How to add Pinyin diacritics programmatically in Google Apps Script?

I would like to write a function dia, which adds Pinyin diacritics to provided characters, so that, for example, dia('a', 1) == 'ā' dia('a', 2) == 'á' dia('a', 3) == 'ǎ' dia('a', 4) == 'à' and I don't ...
0
votes
0answers
35 views

Google NotoSansCJKtc-Regular.ttf contains Gujarati, Hindi, Urdu letters but they didn't appear in PDF

I have converted NotoSansCJKsc-Regular.otf to NotoSansCJKsc-Regular.ttf. As per Guideline on https://www.google.com/get/noto/help/cjk/ "Each font sets one language as the default. Note that each ...
0
votes
0answers
21 views

How to import a CSV file with Chinese characters in R?

I tried read.csv("data.csv", encoding="UTF-8") but it does not work.
0
votes
0answers
29 views

What key event is executed for Japanese Hiragana space key?

In my software I'm doing something like: private void Panel_KeyDown(object sender, System.Windows.Input.KeyEventArgs e) { if (e.Key == Key.Return) { ...
1
vote
2answers
123 views

How do I filter out invisible characters without affecting Japanese character set?

I noticed that some of my input is getting U+2028. I don't know what this is, but how can I prevent this with consideration of UTF-8 and English/Japanese characters?
2
votes
2answers
47 views

How to search full-width Japanese numbers in Couchbase

I'm getting an error when I try a full-text search for the full-width number "" in Couchbase 6.0.3. Exception thrown: err: bleve: QueryBleve validating request, err: parse error: error parsing number: strconv....
1
vote
1answer
55 views

How can I query in Japanese on Rails Active Record?

I want to query in Japanese on Ruby on Rails. My current code is not working and only returns an empty set. The database, MySQL 8.0, is running on Docker with the default settings. Should I ...
2
votes
0answers
31 views

How to detect Japanese words using Google Vision with horizontal lines, or using TEXT_DETECTION to detect text in files (PDF)

I am using Google Vision to detect document text in a PDF file, but I have some trouble with the response. The result is great, but some symbols are detected with a vertical line. I know Japan ...
0
votes
0answers
36 views

“Run selection” and “Source” give different results when running R scripts containing Chinese

The following code gives different results in RStudio with "run selection" and "source": library(dplyr) gender <- c("Ů", "Ů", "Ů", "", "", "") gender2 <- recode(gender, ""="M", "Ů"="F") ...
0
votes
0answers
21 views

How to index mixed Alphanumeric and Japanese in Elasticsearch

I have an Elasticsearch index that is currently using the ICU tokenizer with cjk width. I can successfully search for Japanese terms. The client has terms such as DRӋ or in some cases alpha characters ...
0
votes
0answers
10 views

How can Wandy display Chinese?

Some pictures include Chinese, but Wandy cannot display them properly. What should I do? Would any encoding or decoding work?
2
votes
0answers
43 views

Filtering criteria for rows that only contain certain Chinese characters

Suppose table_a consists of a column of approximately 1,500 unique Chinese characters and table_b consists of a column of approximately 50,000 unique Chinese character combinations (multi-character ...
0
votes
0answers
75 views

failed to compile Haskell project “helloworld”

I am a Haskell beginner and I have downloaded ghc-8.8.1 from the official site. I tried to compile my first Haskell program with it and failed. Here is my program helloworld.hs: main = putStrLn "...
0
votes
0answers
59 views

SwiftUI TextField with Japanese writing

I have a question. When I use TextField in SwiftUI with Japanese characters, the size of the text is really different. Also, when I use TextField to input text like a Japanese name, it's really ...
0
votes
0answers
18 views

How to avoid word tokens when converting a dataframe into a corpus?

I am trying to convert a dataframe into a corpus for Chinese materials. I used JiebaR to split and tokenize the text and then ran the command corpus1 = corpus(dataframe). However, after this process, ...
0
votes
0answers
24 views

When international characters are written to HttpServletResponse in Java using PrintWriter, they are displayed as 你好

When we write Chinese content, obtained as the response from a REST call, to HttpServletResponse using PrintWriter as below, the issue occurs. In the servlet, when we use the below code, it gets ...
1
vote
0answers
88 views

JavaScript: event.preventDefault() does not work for Japanese IME

I want to create a text input which does not allow any character to be entered (same as a disabled input, but with the mouse cursor still shown) function loadPage() { const el = document.getElementById('...
-1
votes
1answer
27 views

How to show Chinese characters in Python?

Python: I am trying to fetch content from a Chinese website, but the result shows no Chinese characters. How to solve this problem? code: import urllib.request doc="http://data.eastmoney.com/cjsj/...
0
votes
1answer
47 views

I would like to know a way to convert following unicode string to modern Japanese characters

I've been attempting to create a Convolutional network that involves me using the Kuzushiji dataset to convert ancient kuzushiji documents into modern Japanese. I was just looking for a way to decode ...
1
vote
2answers
38 views

How to print Chinese, Japanese and Korean strings in CodeIgniter

I'm working on a PHP site that needs Chinese, Japanese and Korean, but I can't get the characters to display if I print a string like this: views/index.php echo "...
0
votes
2answers
43 views

Scraping Traditional Chinese with BeautifulSoup4: output file cannot display Chinese characters

This is the page I'm trying to scrape: https://zh.wikisource.org/wiki/%E8%AE%80%E9%80%9A%E9%91%92%E8%AB%96/%E5%8D%B701 The page is encoded in UTF-8. Here is my code: import requests as r from bs4 ...
0
votes
0answers
21 views

Stop a ruby annotation being wider than the text it is associated with

This HTML: <div> <ruby>ߥå<rt>ceramic&nbsp;</rt></ruby> <ruby><rt>comparison<br>contrast<br></rt></ruby> <span>Ǥϡ</...
3
votes
1answer
50 views

Which encoding? Character strings enclosed by tilde ~ and curly braces {}

The APNIC Whois database contains a lot of entries for Chinese entities with some sort of encoding enclosed by ~{ and ~}. For example: $ whois 211.68.92.0 | grep ^descr: descr: ~{146{J5QiJR;...