Commit ce280c9
Speed up `codespell:ignore` check by skipping the regex in most cases
The changes to provide a public API had a performance cost of about
1% of runtime. There is no trivial way to offset this any further
without undermining the API we are building. However, we can pull
performance-related shenanigans to compensate for the cost
introduced.
The codespell codebase unsurprisingly spends the vast majority of its
runtime in regex-related code such as `search` and `finditer`.
The best way to optimize runtime spent in regexes is to not run a regex
in the first place, since the regex engine has a rather steep overhead
compared to regular string primitives (that is the cost of flexibility).
If the regex rarely matches and there is an easy static substring
that can be used to rule out the match, then you can speed up the code
by using `substring in string` as a conditional to skip the
regex. This assumes the regex is used often enough for the performance
to matter.
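
A minimal sketch of that guard, assuming a hypothetical pattern and
literal (`TODO_RE` and `"TODO("` are made up for illustration and are
not from the codespell codebase). The cheap `in` test runs first, and
the regex only runs on lines that contain the literal:

```python
import re

# Hypothetical pattern, for illustration only.
# Every possible match must contain the literal "TODO(".
TODO_RE = re.compile(r"TODO\((\w+)\):\s*(.*)")


def find_todo(line):
    """Return the regex match for `line`, or None if there is none."""
    # Cheap substring test: if the literal is absent, the regex cannot
    # match, so we skip the regex engine entirely.
    if "TODO(" not in line:
        return None
    # The literal is present; only now pay for the full regex.
    return TODO_RE.search(line)
```

The guard is only safe when every string the regex can match contains
the literal; otherwise it would silently drop valid matches.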
An obvious choice here falls on the `codespell:ignore` regex, because
it has a very distinctive substring in the form of `codespell:ignore`,
which will rule out almost all lines that will not match.
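
Applied to this case, the idea might look roughly like the sketch
below. `IGNORE_RE` is a simplified stand-in and `ignored_words` is a
hypothetical helper; neither is taken from the actual change:

```python
import re

# Simplified stand-in for the inline-ignore pattern (an assumption,
# not the real codespell regex).
IGNORE_RE = re.compile(r"codespell:ignore\b(?:\s+(?P<words>[\w,]+))?")


def ignored_words(line):
    """Return None if the line has no ignore directive.

    Otherwise return the listed words; an empty list means the
    directive named no specific words.
    """
    # Almost all lines lack this substring, so they never reach the regex.
    if "codespell:ignore" not in line:
        return None
    match = IGNORE_RE.search(line)
    if match is None:
        # Substring present but not a real directive, e.g. "codespell:ignored".
        return None
    words = match.group("words")
    return words.split(",") if words else []
```

The substring test can produce false positives (the regex still has
the final say), but never false negatives, so the result is the same
as running the regex on every line.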
With this little trick, runtime goes from ~5.6s to ~4.9s on the corpus
mentioned in #3419.
1 parent 3c08c9b commit ce280c9
1 file changed, +6 -1 lines changed (hunks at original lines 109-115 and 177-182; diff content not captured)