I’ve recently made some significant updates to the project.
Updates to the HDOC Format
First of all, I’ve updated the HDOC format. You can view the latest format description here. The visible title now sits in the header section, alongside other visible metadata such as the author and date, and is kept separate from the content. As a result, changes to the header section no longer break floating links that exist in the content section.
Parsing Arbitrary Web Pages
It’s now possible to create floating links between HDOCs and arbitrary web pages. These pages are parsed locally and converted into HDOCs. There are two types of parsing:
- Regular parsing is for everyday use. It lets you download web pages in a readable format, but it’s not suitable for creating floating links. In fact, I’ve disabled viewing floating links on such documents when the main HDOC (the one that contains the floating links) is downloaded from the web. You can still view floating links that connect your locally created documents to HDOCs created with regular parsing.
- Deterministic parsing uses a special notation in the URL (after the # sign) to specify exactly how a document should be parsed. This ensures that readers always see a page parsed exactly as intended, with no broken floating links caused by differences in how client apps handle parsing.
Deterministic parsing may require some extra work and, in some cases, may not be possible to use. That’s why, for casual reading and downloading, I use text-density-based regular parsing instead.
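To make the idea concrete, here is a minimal sketch of how a client might read parsing instructions from the part of a URL after the # sign. The notation shown (the "hdoc:" prefix and the "content" and "title" keys) is purely hypothetical and is not the actual notation used by the project; the point is only that the same URL always yields the same rules, so every client parses the page the same way.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit

# HYPOTHETICAL fragment notation, invented for this sketch only:
#   https://example.com/post#hdoc:content=article.post;title=h1
PREFIX = "hdoc:"


def parsing_rules(url: str) -> Optional[Dict[str, str]]:
    """Return the parsing rules carried in the URL fragment.

    Returns None when the fragment holds no instructions, in which
    case a client would fall back to regular parsing.
    """
    fragment = urlsplit(url).fragment
    if not fragment.startswith(PREFIX):
        return None
    rules: Dict[str, str] = {}
    for pair in fragment[len(PREFIX):].split(";"):
        key, sep, value = pair.partition("=")
        if sep and key:  # skip malformed pairs instead of guessing
            rules[key] = value
    return rules
```

Because the rules travel inside the link itself, two readers opening the same link get the same HDOC, which is what keeps floating links from breaking.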
WordPress Trick
I’ve also implemented a trick that allows the system to use the WordPress API for WordPress sites (about 40% of the web). This produces a deterministically parsed document that can even include a comments section with paginated loading.
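As an illustration, the sketch below builds the two WordPress REST API requests such a trick could rely on: looking up a post by its slug and fetching one page of its comments. The /wp-json/wp/v2 routes and the slug, post, page, and per_page parameters are part of the standard WordPress REST API; everything else is an assumption, including the helper names, the idea that the last path segment of the page URL is the slug, and that the site lives at the domain root. This is not necessarily how the app does it.

```python
from urllib.parse import urlencode, urlsplit

API_ROOT = "/wp-json/wp/v2"  # default REST route on a WordPress site


def post_lookup_url(page_url: str) -> str:
    """Build the request that finds a post by its slug.

    Assumption: the last path segment of the page URL is the slug.
    """
    parts = urlsplit(page_url)
    slug = parts.path.rstrip("/").rsplit("/", 1)[-1]
    query = urlencode({"slug": slug})
    return f"{parts.scheme}://{parts.netloc}{API_ROOT}/posts?{query}"


def comments_url(page_url: str, post_id: int,
                 page: int = 1, per_page: int = 20) -> str:
    """Build the request for one page of comments on a post."""
    parts = urlsplit(page_url)
    query = urlencode({"post": post_id, "page": page, "per_page": per_page})
    return f"{parts.scheme}://{parts.netloc}{API_ROOT}/comments?{query}"
```

The comments response carries an X-WP-TotalPages header, so a client can tell when to stop requesting further pages.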
JSON Format for Comments in HDOCs
Speaking of comments — I’ve added a new JSON format for them. In fact, I suggest using the same format WordPress uses, but universally across all websites. If it works well for 40% of the web, it should work for the rest too. In the future, I plan to add visible connections between comments and the main text of each article.
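For reference, here is a small sketch of what consuming comments in the WordPress shape could look like. The fields used (id, post, parent, author_name, date, content.rendered) are a subset of what the WordPress REST API returns for a comment; the sample data and the thread helper are made up for illustration, and the exact subset an HDOC would carry is an assumption.

```python
import json
from collections import defaultdict
from typing import Dict, List

# Made-up sample data using a subset of WordPress comment fields.
SAMPLE = """[
  {"id": 1, "post": 42, "parent": 0, "author_name": "Ann",
   "date": "2025-03-01T10:00:00", "content": {"rendered": "<p>First</p>"}},
  {"id": 2, "post": 42, "parent": 1, "author_name": "Bob",
   "date": "2025-03-01T11:00:00", "content": {"rendered": "<p>Reply</p>"}},
  {"id": 3, "post": 42, "parent": 0, "author_name": "Cy",
   "date": "2025-03-02T09:00:00", "content": {"rendered": "<p>Second</p>"}}
]"""


def thread(comments: List[dict]) -> Dict[int, List[int]]:
    """Group comment ids by parent id; parent 0 means top-level."""
    children: Dict[int, List[int]] = defaultdict(list)
    for comment in comments:
        children[comment["parent"]].append(comment["id"])
    return dict(children)
```

Since replies point at their parent by id, a client can rebuild the whole discussion tree from a flat, paginated list.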
Embedded HDOCs
I’ve also introduced embedded HDOCs. Originally, HDOCs were meant to be served from separate URLs, distinct from the original page URLs. But then I realized — why create extra endpoints?
Instead, we can embed JSON containing all the necessary information directly in the page, marking the content with a special class. The client app can then construct the HDOC locally, while keeping the original appearance exactly as the author intended. This way, each article only needs a single URL.
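A rough sketch of how a client could assemble an HDOC from such a page is shown below. The class names ("hdoc-meta" on the JSON script tag and "hdoc-content" on the article body) and the header/content keys of the result are hypothetical placeholders, not the names the format actually uses.

```python
import json
from html.parser import HTMLParser

META_CLASS = "hdoc-meta"        # hypothetical class on the JSON <script>
CONTENT_CLASS = "hdoc-content"  # hypothetical class on the article body
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}  # no end tag


class EmbeddedHdocParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.meta_json = ""
        self.text = []
        self._in_meta = False
        self._depth = 0  # nesting depth inside the content element

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "script" and META_CLASS in classes:
            self._in_meta = True
        elif self._depth:
            if tag not in VOID_TAGS:
                self._depth += 1
        elif CONTENT_CLASS in classes:
            self._depth = 1

    def handle_endtag(self, tag):
        if self._in_meta and tag == "script":
            self._in_meta = False
        elif self._depth and tag not in VOID_TAGS:
            self._depth -= 1

    def handle_data(self, data):
        if self._in_meta:
            self.meta_json += data
        elif self._depth and data.strip():
            self.text.append(data.strip())


def build_hdoc(html: str) -> dict:
    """Combine the embedded JSON with the text of the marked content."""
    parser = EmbeddedHdocParser()
    parser.feed(html)
    parser.close()
    return {"header": json.loads(parser.meta_json), "content": parser.text}
```

Everything outside the marked element (navigation, footers, sidebars) is ignored, so the page can keep its original layout for regular visitors.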
Standalone HDOCs are still supported by the app. The WordPress plugin serves them only if the original page is disabled.
Browser Extension
I didn’t mention it in my “Plans for 2025” article, but I always knew I’d eventually build a browser extension to make it easier to collect links, images, and documents from the browser into the app. So — I built it.
Dropping sw:// and sws:// URI Schemes
Before the app gets any users, I decided to drop the custom URI schemes.
Originally, before the extension existed, I thought users without it would need a way to send HDOCs from the browser into the app — hence the custom schemes. But now, the extension makes that unnecessary. It’s too convenient not to use, and it handles all data transfer between the browser and the app without any custom URI schemes.
Extension Instead of Browser (Change of Plans)
In my “Plans for 2025” article, I mentioned that I was planning to create a browser that supports the new data formats. But recently, I decided it makes more sense to add that functionality to my existing browser extension instead.
First, this approach means I don’t have to compete with existing browsers.
Second, with the introduction of embedded HDOCs, the extension can now detect them and simply replace the original page with the HDOC version.
For loading additional pages, the extension will connect to the LZ Desktop app, which will act as a proxy server. Since pages in the browser can’t make direct requests to arbitrary websites (cross-origin requests are restricted), the desktop app will help work around that restriction.
For now, this is only a change of plans: I haven’t implemented this functionality in the extension yet.
Other Improvements
I’ve also made many smaller changes to the project, which I won’t go into here.