I've written some progressively faster word counting programs. We'll start with Python, then drop down to C, and finally use single instruction, multiple data (SIMD) programming to go as fast as possible.

The task is to count the words in an ASCII text file. For example, "Hello there!" contains 2 words, and my 1 GiB benchmark text file contains 65 million words.

At a high level: read bytes, scan while tracking minimal state, write the count to stdout.

These are the results from my Apple M1 Pro that I'll dig into:

Python (byte loop): 89.6 s
Python + re: 13.7 s
C (scalar loop): 1.205 s
C + ARM NEON SIMD: 249 ms
C + ARM NEON SIMD + threads: 181 ms

You can also jump straight to the source files.

First try (89.6 seconds)

Here's a reasonable first attempt, but you might spot some obvious performance-related deficiencies.

We read each byte and check whether it belongs to a set of whitespace characters, while tracking the word count and whether the previous byte was whitespace. A word starts wherever a non-whitespace byte follows whitespace (or the start of the file).

    # 0_mvp.py
    import sys

    ws = set(b" \n\r\t\v\f")  # ASCII whitespace, stored as integer byte values
    prev_ws = True            # treat the start of the file as whitespace
    words = 0
    with open(sys.argv[1], "rb") as f:
        for byte_value in f.read():
            cur_ws = byte_value in ws
            # Count a word on each whitespace -> non-whitespace transition.
            if not cur_ws and prev_ws:
                words += 1
            prev_ws = cur_ws
    print(words)

This program is horrendously slow. It takes 89.6 seconds on my Apple M1 Pro. Python code runs for every byte, incurring interpreter dispatch and object checks again and again.

Using CPython efficiently (13.7 seconds)

There's a big improvement we can make before having to leave Python behind. We can make the program faster by making sure all the work happens in C, in tight loops, with no per-byte Python overhead.

CPython's re module is a thin Python wrapper around a C extension named _sre, "Secret Labs' Regular Expression Engine." Patterns are parsed in Python into a compact bytecode, then executed by the C engine.
So a call like re.finditer(pattern, data) spends nearly all of its time inside C, scanning contiguous memory with pointer arithmetic and table lookups.

    # 1_c_regex.py
    words = 0
    with o...
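As a sanity check on the idea, here is a minimal sketch that runs the per-byte loop and a regex scan over the same in-memory bytes and confirms they agree. The function names and the pattern are assumptions made for illustration. In particular, the pattern (one or more non-whitespace bytes) is a guess at a reasonable word matcher, not necessarily the one used in 1_c_regex.py.

```python
import re

# The same six ASCII whitespace bytes used by 0_mvp.py.
WHITESPACE = b" \n\r\t\v\f"

# Assumed pattern: a "word" is a maximal run of non-whitespace bytes.
WORD_RE = re.compile(rb"[^ \n\r\t\v\f]+")


def count_words_loop(data: bytes) -> int:
    """Per-byte state machine: count whitespace -> non-whitespace transitions."""
    ws = set(WHITESPACE)
    prev_ws = True  # the start of the input behaves like whitespace
    words = 0
    for byte_value in data:
        cur_ws = byte_value in ws
        if not cur_ws and prev_ws:
            words += 1
        prev_ws = cur_ws
    return words


def count_words_regex(data: bytes) -> int:
    """Let the C regex engine find each run; Python only counts the matches."""
    return sum(1 for _ in WORD_RE.finditer(data))


if __name__ == "__main__":
    samples = [b"", b"Hello there!", b"  a\tb\nc  ", b"\v\f"]
    for sample in samples:
        assert count_words_loop(sample) == count_words_regex(sample)
    print(count_words_regex(b"Hello there!"))  # prints 2
```

Both functions do the same logical work. The difference is where the loop over bytes runs: in the interpreter for the first, inside the C engine for the second, with Python code executing once per match rather than once per byte.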