Last March, I transitioned from full-time writing (i.e., unemployment) back to the “workplace” to develop products. I even had the nerve to ask everyone to subscribe to a service that hadn’t launched yet, and I was fortunate enough to gain the support of 100 Genesis members.
A year has passed, and it is time to piece the fragments together into a complete picture and officially introduce 3ook.com (book-dot-com), the product whose UX design I am responsible for.
3ook.com
Read, Listen, Own. The 3rd-gen bookstore ecosystem where authors and readers co-create a decentralized home for books and periodicals. By merging AI-enhanced experiences with Web3 ownership, we safeguard the freedom to read and the integrity of our shared history.
If I were limited to fifty words, the above would be my introduction to 3ook.com. However, since our readers seek deeper exploration, I will expand on the topic, focusing this time on the “Read & Listen” and “Utilizing AI” aspects.
Not Audiobooks, but Books with Voice
The most significant feature of 3ook.com is that it does not distinguish between text books and audiobooks. Every book can be switched between reading and listening modes at any time, unless the author chooses to disable the narration feature.
Although "Integrated Reading and Listening" may seem like a minor setting, it fundamentally changes how people interact with books, making the experience more logical and better suited to modern life. Take my own experience as an example. I mostly listened to the five books I recently “read” (我城零五, 時間的試煉, 浮世薔薇, H醫生一千零一夜, and 豆腐媽媽) while commuting, walking, hiking, or lying down, and only occasionally switched to reading mode as a supplement. Without this option, however much I wanted to read these works, I would have struggled to find the time and the visual “HP” (health points) to finish them, given my daily rhythm and eyes fatigued by long hours in front of a screen, not to mention the large volume of other excellent works available.
You might say that listening to books is nothing new, given that Audible has a twenty-year history. Indeed, I occasionally listen to audiobooks on platforms like Audible or Kobo. However, with traditional audiobooks, unless you purchase the text version separately, you cannot switch to reading mode at any time to clarify unfamiliar words, appreciate beautiful illustrations, or view infographics.
Furthermore, the vast majority of books simply do not have an audio version. Take Amazon as an example: there are a hundred million text editions, while its subsidiary Audible has only one or two million audiobooks, a coverage rate of at most 2%. Kobo fares slightly better: among its nearly ten million books, about 5% offer an audio version. Even so, 95% of titles are left behind. The absence of an audio version is the norm for e-books.
If you are reading popular titles like Atomic Habits or Chip War, finding an audiobook is not a problem. But if you are like me and interested in relatively niche subjects, especially Hong Kong books, the chance of finding an audio version is nearly zero. After all, even the entire revenue of such a work might not cover the cost of recording an audiobook using traditional methods.
Thanks to the rapid advancements in generative AI, 3ook.com pre-samples the voices of voice actors, analyzes their vocal characteristics, and synthesizes the entire text. The results are already quite impressive. There are occasional mispronunciations and unnatural pauses, but to be fair, were it not for these minor flaws, it would be hard to distinguish an AI-generated narration from a human one.
Not to be self-deprecating, but objectively speaking, traditional e-book platforms are unlikely to hire someone to narrate 區塊鏈社會學 or 財富自由主義. As a niche author, if I wanted an audio version of my work, I had to find a way myself. Now that they are on 3ook.com, my six works and the upcoming 進擊的自由 are finally available in both text and audio versions simultaneously.
Designing 3ook.com began with satisfying my own needs, but I believe there are many readers, authors, and cultural preservationists facing the same issues.
Narrating Niche Works in Niche Languages
Just like niche works, niche languages cannot generate sufficient commercial scale.
Whether we are willing to accept it or not, Cantonese is becoming increasingly niche. I have immense respect for podcasters who love Cantonese and narrate Hong Kong books in our mother tongue, but we cannot deny that for most works, the market scale for Cantonese audiobooks is insufficient to support the cost of narration. However, with voice generation technology, there is hope for this problem to be solved.
3ook.com currently offers three default voice talents: travel expert Pazu, and my two partners Phoebe and Aurora. Pazu and Phoebe narrate in Cantonese, while Aurora narrates in Mandarin. To be honest, compared to Mandarin, flaws in the Cantonese narration are slightly more frequent, but the quality is already well above a passing level. Considering that AI models are still improving rapidly, with enough training data, the quality of Cantonese narration will inevitably continue to advance, correcting that final 1% of error.
I am a bit sorry that 3ook.com cannot yet provide a Taiwanese (Hokkien) voice talent, as our resources are very limited. If there are Taiwanese readers who can provide support, we will certainly add it as soon as possible. Although I do not understand Taiwanese, I believe the intimacy of receiving information in one’s mother tongue and the wish to preserve a gradually marginalized culture are universal values.
In addition to the default voice talents, 3ook.com launched the “Private Voice Talent” feature earlier this month, allowing members to upload their own voice, or a voice they are authorized to use, to narrate works on their bookshelf. In recent weeks, almost every time I meet a friend, I all but force them to record a sample. A minute later, their voice can be used for narration. Sometimes the cloned voice is so similar that it is indistinguishable from the original; other times the effect isn’t quite ideal yet. The team is using various methods to fine-tune it, and we are confident that the precision of voice cloning will improve further in the near future.
Just imagine: a parent telling stories to their child, or a loved one narrating a novel for you. That “goosebump-inducing” feeling is something only the people involved can truly experience.
I am not an “AI enthusiast” who tries every new model daily. From beginning to end, I adhere to the principle of “Humanity as the Essence, Technology as the Tool.” Through voice generation technology, 3ook.com aims to give every book a voice, narrated by the most familiar voices in our mother tongue, to preserve culture and keep the emotion alive.
P.S. After a 12-year absence, Stefanie Sun returned to Hong Kong for a concert last week. I attended with the feeling of catching up with my little sister. Seeing a slightly fuller Stefanie looking energetic, I was very happy for this mother of two. During the request segment, Stefanie invited Hong Kong people in the audience to raise their hands and then picked one. As soon as the person spoke, it was rhotic Mandarin—a Chinese person visiting Hong Kong for the concert. The next request was the same, and the third and fourth as well. It was a constant reminder that I was in “Hong Kong, China.” It wasn’t just the few picked for requests; by my “ear-test,” the six people to my right and at least three to my left were all from the mainland. Most of what I heard on the way in and out was Mandarin. It is not just Cantonese that is becoming increasingly niche.

