Exploring Large HTML Documents on the Web

https://news.ycombinator.com/rss Hits: 7
Summary

Most HTML documents are relatively small, providing a starting point for other resources on the page to load. But why do some websites load several megabytes of HTML code? Usually it’s not that there’s a lot of content on the page, but rather that other types of resources are embedded within the document.

In this article, we’ll look at examples of large HTML documents around the web and peek into the code to see what’s making them so big.

HTML on the web is full of surprises. In the process of writing this article I rebuilt most of the DebugBear HTML Size Analyzer. If your HTML contains scripts that contain JSON that contains HTML that contains CSS that contains images – that’s supported now!

Embedded images

Base64 encoding is a way to turn images into text, so that they can be embedded in a text file like HTML or CSS. Embedding images directly in the HTML has a big advantage: the browser no longer needs to make a separate request to display the image.

However, for large files it’s likely to cause problems. For example, the image can no longer be cached independently, and the image will be prioritized in the same way as the document content, while usually it’s ok for images to load later.

Here’s an example of PNG files that are embedded in HTML using data URLs.

There are different variations of this pattern:

- Sometimes it’s a single multi-megabyte image that was included accidentally, other times there are hundreds of small icons that added up over time.
- I saw a site using responsive images together with data URLs. One goal of responsive images is only loading images at the minimum necessary resolution, but embedding all versions in the HTML has the opposite effect.

Indirectly embedded images:

- Inline SVGs that are themselves a thin wrapper around PNG or JPEG
- Background images from inlined CSS stylesheets
- Images within JSON data (more on that later 😬)

Here’s an example of a style tag that contains 201 rules with embedded background images.
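To make the cost of the embedded-image patterns described above more concrete, here is a minimal Python sketch that builds a base64 data URL and then scans an HTML string for such URLs to see how much of the document they occupy. It is only an illustration and not how the DebugBear HTML Size Analyzer works: the function names `to_data_url` and `embedded_image_report` are hypothetical, and the regular expression only catches base64-encoded `image/*` data URLs, not other embedding styles. Because base64 represents every 3 bytes as 4 characters, an embedded image takes up roughly a third more space in the HTML than the original file.

```python
import base64
import re

# Matches base64 image data URLs such as data:image/png;base64,iVBORw0...
# Assumption: other data URL forms (e.g. URL-encoded SVG) are ignored.
DATA_URL_RE = re.compile(
    r"data:(image/[a-zA-Z0-9.+-]+);base64,([A-Za-z0-9+/=]+)"
)


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def embedded_image_report(html: str) -> list[dict]:
    """Return one entry per base64 image data URL found, largest first."""
    report = []
    for match in DATA_URL_RE.finditer(html):
        payload = match.group(2)
        report.append({
            "mime_type": match.group(1),
            # Characters this data URL occupies in the HTML document.
            "html_chars": len(match.group(0)),
            # Size of the original file: 3 bytes per 4 characters,
            # minus one byte for each "=" padding character.
            "decoded_bytes": len(payload) * 3 // 4 - payload.count("="),
        })
    return sorted(report, key=lambda entry: entry["html_chars"], reverse=True)


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file.
    fake_image = bytes(range(256)) * 3
    html = f'<img src="{to_data_url(fake_image)}" alt="demo">'
    for entry in embedded_image_report(html):
        print(entry)
```

Running a report like this over a saved page can show whether the size comes from one large image or from many small ones.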
Inline CSS

Large inline CSS i...

First seen: 2025-12-02 22:55

Last seen: 2025-12-03 04:56