Questions tagged [cjk]

CJK stands for Chinese, Japanese and Korean and is used to label issues common to these East Asian languages and their large character repertoires.

0
votes
0answers
13 views

How can I scrape content from a website that's in Simplified Chinese?

I have tested this code on various English language websites with no problem. However, when I tried to scrape content from a website that's in Simplified Chinese, the data appeared as gibberish in the ...
0
votes
2answers
33 views

How to change language of Oracle Client 12c Release 2

I have installed Oracle Client 12c Release 2 on Windows. I want to change its language from English to Japanese, so that any errors are shown in Japanese. I have set Region, ...
1
vote
0answers
32 views

RxSwift breaks Japanese - Romaji input

Add the Japanese - Romaji keyboard to your device. Settings > Keyboards > Add New Keyboard When typing on that textfield if you try to type tada you won't get the correct input but d. Any ideas ...
4
votes
1answer
61 views

How to make beautiful line breaks in Japanese?

I have a website in English and Japanese. English is displayed perfectly. There are problems with hyphenation in Japanese. Sometimes hanging 1-2 characters remain on a new line. I want to manage ...
1
vote
0answers
26 views

C#: SSML syntax for Chinese pinyin using Microsoft Azure Text to Speech API

I have developed a C# Windows app using the Azure "Text to Speech" API, but for Chinese I have not figured out how to insert pinyin strings in an SSML format that the Azure API can interpret. I'm aware ...
0
votes
0answers
18 views

Cleaning of Chinese text data [duplicate]

I have a Chinese text corpus. Besides Chinese characters, it also has 1) English characters 2) space-like characters 3) Chinese punctuation, e.g., , etc. 4) digits, e.g., 1.23. I'd like to keep all the ...
0
votes
1answer
38 views

I have a large Chinese text file and I want to reformat it into individual lines, each ending with a period

I want to separate this file into lines (each ending with a period (question mark, exclamation point, etc)) in order to make it easier to work with later on. I attempted to use nltk, but to no avail: ...
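One way to approach the splitting described in this question, without nltk, is a regular expression over sentence-final punctuation. The following is only a minimal sketch: the function name `split_sentences` and the chosen punctuation set (full-width 。！？ plus ASCII ! and ?) are assumptions for illustration, not something taken from the question.

```python
import re
from typing import List

# Assumed set of sentence-final marks: full-width and ASCII forms.
_SENTENCE_END = re.compile(r"([。！？!?]+)")

def split_sentences(text: str) -> List[str]:
    """Split text into sentences, keeping the terminal punctuation."""
    # With a capturing group, re.split returns text and delimiters
    # alternately: [text, delim, text, delim, ..., trailing text].
    parts = _SENTENCE_END.split(text)
    sentences = []
    for i in range(0, len(parts) - 1, 2):
        sentence = (parts[i] + parts[i + 1]).strip()
        if sentence:
            sentences.append(sentence)
    # Keep any trailing text that has no terminal punctuation.
    tail = parts[-1].strip()
    if tail:
        sentences.append(tail)
    return sentences

if __name__ == "__main__":
    for line in split_sentences("你好。你是谁？我很好"):
        print(line)
```

Each returned item can then be written out on its own line. Quotation marks and ellipses that follow a sentence-final mark are not handled here and would need extra rules.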
1
vote
1answer
43 views

Kivy Text Input: Chinese characters

When I select Chinese (pinyin) as the input language and try to type, only English letters are displayed and no suggestions for transforming them into Chinese characters are shown. Is there any way to ...
1
vote
1answer
38 views

Displaying Japanese characters with PHP

I have plain Japanese text stored with utf8mb_general_ci in a MySQL table; I can fetch a row and display it as a single string. But what I need is to get a single character from the string and use it for a ...
2
votes
0answers
30 views

Spring MVC Controller parsing emojis as CJK chars

I am trying to post emojis in a simple form. This is received in the controller as And I cannot find the reason for this. I'm using Spring 3.1.4. I have added the following to the web.xml: <...
2
votes
1answer
41 views

Pandas DataFrame: Replace based on filter and regex extract

Here's a section of my dataframe: Type Date Diff Data 0 Section 20171204 1.0 ~ 1 Korean 20171204 1.0 . 2 English 20171204 1.0 Im Yooyang. 3 Theme ...
-1
votes
1answer
20 views

How to change the language of quest? (AzerothCore)

I used the AzerothCore Docker build to set up my private server, and I can log in successfully with the Chinese client. The problem is that the quest text, item names, etc. are in English, not Chinese. Is it possible to change ...
1
vote
0answers
35 views

How to fix an error when I import Excel data into PostgreSQL?

I am trying to import an Excel file with some Japanese data in it. I tried solutions that use COPY and the pgAdmin import, but I always end up with an error message: C:/Users/O53/Desktop/Book4.csv : ...
3
votes
0answers
83 views

Why when I use select statement like “select * from table_name where character='' ” then return '','','' and ''

A MySQL query returns mismatched records: for example, the query condition is character='أ' but it matches , , , ? I updated MySQL from 5.7 to 8.0; in 5.7 this problem is more common. drop database if exists `...
0
votes
0answers
11 views

How to segment a Chinese paragraph into sentences using jieba?

I want to segment a Chinese paragraph into sentences. What is the best way to do it? I found the jieba library, but it's not helpful in the sense that I am not able to get proper sentences but only words as ...
1
vote
1answer
29 views

Chinese character comparison returning false when it should return true

I am performing simple string comparison between two Chinese characters which are both properly decoded (I think) from UTF-8, however, the results are still non-equal and I haven't been able to figure ...
0
votes
1answer
28 views

How do browsers deal with “Tofu” characters

character. I am using the Orbitron font in a hybrid Cordova/Android app that I am creating - quite simply because it is compact and has the clean, futuristic look that I am after. However, I ...
0
votes
0answers
24 views

Has anybody used “lookup-tables” in Rasa with Chinese?

I am learning rasa_nlu. In my case, I need to use "lookup-tables" in Rasa. I have run the demo from rasa_lookup_demo successfully. But when I change the training data to Chinese and use a pipeline like ...
0
votes
1answer
48 views

Japanese characters are not showing (blank)

I want to show a select box which shows language types as their own languages. For example, Korean will be displayed as . The other languages are OK, but Japanese characters are showing as blank ...
1
vote
0answers
26 views

Render Japanese font in ggplot

I am creating an R Shiny web app, which I have to publish in Japanese. The application plots many graphs, mainly using ggplot2. After I run my application on my local machine, the browser renders all ...
0
votes
0answers
17 views

How to find exact word in Japanese language text using regex in php? [duplicate]

I've Japanese input text. For instance, consider the following: 'ҤǤ' (English version - Simon is happy) And, I'm looking for sub string - 'ҤǤ' (English version - is happy). How can I find ...
1
vote
2answers
691 views

How to match CJK characters with sed?

I'd like to match CJK characters. But the following regex [[:alpha:]]\+ does not work. Does anybody know how to match CJK characters? $ echo ' a b' | sed -e 's/\([[:alpha:]]\+\)/x\1/g' xa xb The ...
0
votes
2answers
81 views

How to deal with CJK characters correctly on a webpage?

I am not able to see the CJK characters correctly. It seems that the page is mistakenly treated as ISO-8859 encoded. I think UTF-8 is the appropriate encoding. Does anybody know how to fix the problem? $ ...
0
votes
0answers
29 views

Converting a special Chinese character to Big5 fails

In my Java 7 project I am trying to convert Unicode Chinese characters to Big5, and it is working fine for many Chinese characters. I am using the code below to convert a UTF-8 string to a Big5 string. ...
0
votes
1answer
48 views

Color words ending with a digit?

I'm working on an application that could help learners of Chinese memorize word pronunciation. I'm curious if it's possible to implement a Pinyin syllable highlighter in pure CSS. For example, given ...
1
vote
0answers
31 views

SSIS - Loading Japanese double-byte data from DB2 to SQL Server 2017 gives a conversion error

I am trying to import data from DB2, which has double-byte Japanese (1027) data, into SQL Server 2017 using SSIS. I tried using a data conversion before inserting the data: "932 (ANSI/OEM - Japanese Shift-JIS)". ...
1
vote
0answers
14 views

Report Data from Amazon MWS showing Japanese character as question mark

I am fetching an Inventory report from the Amazon MWS API and everything is working well, but Japanese characters in the report are showing as question marks, like (i oiID oiSKU). I did header with ...
0
votes
0answers
22 views

Rejected Builder Variable for Looped Chinese Text Stimuli

I'm running an experiment where I present speakers with text stimuli in Chinese characters and record their production of the sentence. The spreadsheet I'm using for the loop over the trials thus has ...
1
vote
0answers
36 views

XML containing Japanese loads as NULL with MySQL

I'm using Workbench and MySQL to load an XML file into a database. The XML file (which I didn't create) has tags inside of tags, some of which are in Japanese. I'm not sure if I'm dealing with UTF-8 ...
0
votes
0answers
21 views

Speech Korean to English

Speech from my understanding is not just phrases. It is running speech. Korean speech cannot be translated, only short phrases. Is there a way to convert Korean recorded speech to Korean Hangul text? ...
0
votes
1answer
55 views

Japanese character in AdaptiveCard Bot Framework V4

I have been trying to print a simple card with a Japanese character but it keeps displaying boxes and unknown characters. This is how I create my adaptive card, then I get the params and data in a ...
0
votes
2answers
62 views

What the Chinese, Japanese, and Korean characters are in Unicode

From what I've gathered: Hiragana is U+3040 to U+309F Katakana is U+30A0 to U+30FF. U+4E00..U+9FFF is part of the complete [Chinese] set, but not all. The exact ranges for Chinese ...
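The ranges quoted in this question can be turned into a simple code-point lookup. The following is a minimal sketch that uses only the three ranges mentioned in the excerpt; the names `RANGES` and `classify` are hypothetical, and, as the question itself notes, U+4E00..U+9FFF covers only part of the Chinese character set, so anything outside these ranges (including Hangul) is reported as "other".

```python
# Ranges exactly as quoted in the excerpt; CJK ideograph coverage is partial.
RANGES = [
    ("hiragana", 0x3040, 0x309F),
    ("katakana", 0x30A0, 0x30FF),
    ("cjk_unified", 0x4E00, 0x9FFF),
]

def classify(ch: str) -> str:
    """Return the name of the listed range containing ch, or 'other'."""
    code_point = ord(ch)
    for name, low, high in RANGES:
        if low <= code_point <= high:
            return name
    return "other"

if __name__ == "__main__":
    for ch in "あカ漢A":
        print(ch, hex(ord(ch)), classify(ch))
```

A fuller classifier would add further blocks from the Unicode code charts rather than rely on these three ranges alone.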
2
votes
2answers
66 views

RegEx for capturing Korean alphabets

My data frame in name is like below: '(340)', '(8)', '(7)', '(222)', '', '', '', '(214)', '', '', '', '(212)', '(7)', '(317)', '(341)', '', '', ...
0
votes
0answers
29 views

Need to convert hiragana and kanji to katakana in a specific input field in a form

I have to make a registration form in Japanese and need to convert the name field into katakana when the user types in hiragana or kanji characters. This field is specifically for typing their name in ...
0
votes
1answer
31 views

How to disable space when using pandoc with Chinese characters?

When we use vim, we always set vim to limit the number of characters per line. Like this set cc=80 set fo=+tMn So if I convert a markdown file to a docx file, pandoc will automatically place a space ...
1
vote
0answers
30 views

How to enable a Chinese input method for a Qt5 app in Docker

I have built a Qt5 program in an Ubuntu Docker container, but I can't switch the input method to Sogou to input Chinese characters. My Docker container supports Chinese characters, because I can copy Chinese characters to qt ...
0
votes
0answers
73 views

How to change the output language of speech recognition

This code is working, but I'm only able to hear the English output voice; I would like to change the output voice language to Chinese. The code can recognize Chinese but the output is not Chinese, ...
1
vote
1answer
75 views

Japanese datetime format after 1/5/2019

I need to format a DateTime value as a string in Japanese. The problem is: after 1/5/2019 Japan has a new emperor, so it must be 令和元年05月01日. But when I use my code, the result is 平成31年05月01日 public static ...
0
votes
1answer
38 views

How to prevent line breaks in Japanese text

I have HTML code <tr> <th> 勤務開始希望 </th> <td> <input maxlength="250" type="text"> </td> </tr> I want it inline: 勤務開始希望 but ...
0
votes
0answers
42 views

VBScript using FileStream produces random text

I have a small application that makes an HTML file and saves Japanese text in it. I want to save Japanese text in the file, but when I run the file I get a different set of text. This is the code my TEST ...
1
vote
1answer
33 views

Multilingual URL changes to weird characters after launching app via URL scheme

I am using the URL scheme provided by iOS to launch my app from an external source, such as a website. For example, I have created an HTML file which has the code below. The query string contains some ...
0
votes
0answers
15 views

VS Code Japanese Input on Ubuntu

Currently using latest Ubuntu LTS, and I got Japanese input working for all applications EXCEPT VS Code. Despite having the keyboard switched to JP (mozc) hiragana input, it comes up as English only, ...
3
votes
2answers
126 views

How to Format English Date in Japanese with ERA

I want the new Japanese ERA Date as "R010501", whereas I am getting "R151". I am using the com.ibm.icu.text.DateFormat package to get the date format Date dtEngDate = new SimpleDateFormat("yyyy-MM-...
1
vote
1answer
56 views

Fullcalendar - Remove day suffix in dayGridMonth View when using Japanese

When setting the locale to Japanese in FullCalendar and using the dayGridMonth view, the suffix "日" (meaning "day") is added to each day cell. I want to remove this day suffix letter, so that the appearance of the ...
0
votes
0answers
33 views

Japanese Decode in Python - UTF-8 Problem

I have Python code that takes notes saved in Gmail and converts them from bytes to UTF-8, but it doesn't work well with Japanese. This is the code for the decoding. voice_command = email....
0
votes
0answers
10 views

How to deduplicate bidding files?

Recently I have crawled many bid files (in Chinese) and saved them in my database. Because the bid files come from different websites, many of them are repeated (for example Google want to buy a new search ...
0
votes
0answers
12 views

Skip kigou characters in Kakasi

When transcribing Kanji to Romaji with Kakasi, the UTF-8 input could have symbols (kigou) like . In this case, kakasi running with echo "" | kakasi -Ha -Ka -Ja -Ea -ka -s -iutf8 -outf8 will output:...
0
votes
0answers
40 views

How to implement fuzzy search for Chinese pinyin and Japanese romaji?

I have some data in Chinese and Japanese, and I want to make it possible to search by their romanizations (Pinyin for Chinese, Romaji for Japanese). Assume that the romanizations are already provided, ...
0
votes
1answer
47 views

How to fetch and fs.writeFile a webpage with SJIS/Shift_JIS encoding

I am trying to fetch a page from the internet and then save it into an HTML file. The page has this in the header: <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="ja" > <head> <meta ...
1
vote
1answer
71 views

Chinese characters in bash variable are handled differently than in file

I'm receiving Chinese characters in curl outputs and then feeding them as inputs to a Python script, but I get two very different behaviours depending on how I handle the characters. The method which ...