Introduction
For a comprehensive overview of my open-source projects, please visit my GitHub.
I am Ayaka, a 23-year-old researcher in computer science, historical linguistics, and mathematics.
NLP
I am driven by my lifelong goal of designing powerful artificial intelligence that can understand and process a diverse range of languages, both widely spoken and under-resourced. My strong background in computer science has allowed me to gain a thorough understanding of the architecture of state-of-the-art NLP models.
I have made significant contributions to NLP, including the development of TransCan, an English-Cantonese machine translation model that outperforms state-of-the-art commercial products by 11.8 BLEU. I implemented the BART model from scratch using JAX, establishing a versatile codebase for future deep learning model architecture research. Moreover, I created the LIHKG Cantonese Dataset through a scraper that bypassed all 13 layers of Cloudflare’s protection, resulting in a corpus of 172 million unique sentences.
Recent advancements in NLP technology have been particularly exciting, including the release of ChatGPT, a highly advanced text generation model with strong reasoning skills. As one of the first users of ChatGPT Plus, I am thrilled to follow these developments at first hand.
Nevertheless, even the best NLP models today are unable to process low-resource languages, despite the fact that humans are capable of adapting to these languages through appropriate methods. My long-term objective is to find a transfer learning method that enables AI to pick up low-resource languages without human supervision, leveraging the vast amounts of data available on the Internet.
Linguistics
As a historian of languages, I am deeply interested in exploring the evolution of language and how ancient languages from different geographical regions and ethnic groups have developed into what they are today and what they may become in the future.
To explore these questions, I believe it is essential to have a deep understanding of the history of various languages. With regard to the Chinese language, I have utilised my mastery of the Qieyun phonological system to develop the widely used qieyun-js programming library. I also possess a deep knowledge of the history of Japanese kanji pronunciation, with the ability to read any text written in kanji using the old Japanese pronunciation of on'yomi.
To attain an intuitive sense of language evolution, I have been studying a range of related languages and dialects, drawing on knowledge from anthropology and ethnology. For example, in addition to my proficiency in Mandarin and Cantonese, I am familiar with the Hakka and Teochew dialects. I am also studying the North Germanic languages: Norwegian, Danish, Swedish, Icelandic, Faroese and Old Norse.
By leveraging my extensive knowledge of language evolution, I am poised to make valuable contributions to the field of historical linguistics and shape our understanding of the future of world languages.
Mathematics
Mathematics has always been a passion of mine, and I have dedicated myself to exploring a wide range of areas, including topology, group theory, number theory, category theory, type theory, computability theory, and probability theory.
In my research, I am particularly interested in formalising mathematical theorems through theorem provers, specifically using the Lean proof assistant. I believe that this approach not only deepens our understanding of mathematical concepts but also opens up new avenues for discovery in mathematics.
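A minimal sketch of what such a formalisation can look like in Lean 4 is given below. The theorem name `add_comm_example` is hypothetical and chosen purely for illustration; the proof simply appeals to the core library lemma `Nat.add_comm`, so it shows the shape of a machine-checked statement rather than any particular research result.

```lean
-- Illustrative only: the name `add_comm_example` is an assumption,
-- not taken from any existing project.
-- Statement: addition on the natural numbers is commutative.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b  -- discharged by the core library lemma
```

Once Lean accepts the file, the statement has been verified by the proof checker rather than by informal argument alone.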
Unfortunately, owing to time constraints, I am currently unable to provide individual supervision. However, I am always eager to collaborate with other researchers who share my passion for mathematics, and I welcome the opportunity to work together.
If you would like to learn more about my work or discuss potential collaboration opportunities, please feel free to reach out to me. I look forward to hearing from you!
Miscellaneous
Aside from my professional pursuits, I am passionate about anthropology, ethnology, mythology, typography, and astronomy. In my leisure time, I engage in a range of diverse hobbies that help me stay grounded and inspired. These include singing and plane-spotting.
In my use of technology, I have made several unique choices. I use the PWBRHK keyboard layout on my mobile phone, instead of the conventional QWERTY layout. This layout, which I designed specifically with linguistics in mind, is optimised for typing Cantonese and also supports special letters used in Nordic languages.
I currently reside in Singapore, but I observe Australian Eastern Standard Time (AEST). I do so out of love for Australia and because keeping this schedule has a positive impact on my overall health and well-being.
Additionally, I use my self-designed Nya calendar alongside the Gregorian calendar. The Nya calendar takes into account not only the orbit of the Earth, but also that of the Moon and Mercury, and boasts several advantageous features.
In line with my technological preferences, I have chosen to use Arch Linux as my daily desktop operating system. Arch Linux is a lightweight and highly customisable Linux distribution known for its rolling-release model, up-to-date software packages, and minimalistic approach. By maintaining only one copy of a software package, which is shared among all packages that require it, Arch Linux keeps my system uncluttered and reduces redundancy.
As a Baroness of the Principality of Sealand, I am honoured to hold a noble title and play a crucial role in shaping its future. My role is a privilege that enables me to represent its interests, promote its sovereignty, and embody its values and principles. I am proud to stand at the forefront of this self-determined entity, paving the way for a new era of independence and self-expression.
Posts
- Visualising the Computational Graph of a JAX Program
- A Simple Introduction To Data Parallelism In JAX
- Print the Results of JIT Compilation in JAX
- Connect to Google Colab Using SSH
- Installing Packages on Linux Without sudo Privilege Using JuNest
- The Death of Traditional Search Engines
- The World Citizen Manifesto
- The Benefits of ChatGPT in Education