>This is the program I will be using for demonstration purposes
>Never comes up again for the rest of the post.
>Let's show you how to profile code.
>Also, here's a bunch of unprofiled suggestions with such precise and helpful comments as "slow" and "fast".
>Python haters always say, that one of reasons they don't want to use it, is that it's slow. Well, whether specific program - regardless of programming language used - is fast or slow is very much dependant on developer who wrote it and their skill and ability to write optimized and fast programs.
This is so ridiculous it's honestly laughable.
It's such an obvious falsehood that the only explanation is that the person is either truly this clueless or wilfully spewing bullshit. A bare-metal language like C/C++ will of course let you do things faster than a heavy dynamic language like Python.
The mental gymnastics people do to justify not learning another tool.
You know what they say, if all you have is a hammer, everything looks like a nail.
>First rule of optimization is to not do it.
If this person is representative, this explains why computers are hundreds of times faster but most software feels slower than in 1999.
I think they are representative of a lot of developers. With the continued pace of chip development for the last 35 years, there hasn't been a continuing need to program for performance - in general, unless programmers did something very dumb or were dealing with large amounts of data, they could just write the way they wanted and let the hardware handle making their program fast.
Contrast this with early computer games - to get the best performance some games would actually boot your computer without an OS, sacrificing some convenience to get the last few percent of speed needed out of the system because it was the only way to outperform the competition.
One reason there's such opportunity in the present state of CPU technology (clock speeds have halted at about 4 GHz in favor of more cores) is that few people remember how to program for performance, and those that do are handicapped by a bloated OS built for profit rather than value.
The world needs tons of software, and the vast majority of that software just needs to do some things right some of the time, and an average Java EE developer toiling away in a cubicle is good enough to deliver it.
Writing efficient software is a HARD problem, and it doesn't make economic sense to actually write efficient software; it makes sense to write just-good-enough software and throw hardware at it. For the price of a developer-year you can provision hundreds of machines to run that piece of code.
> So, let's prove some people wrong and let's see how we can improve performance of our Python programs and make them really fast!
> [sets the stage with a program that takes 11 seconds to run]
> This is more about general ideas and strategies, which when used, can make a huge impact on performance, in some cases up to 30% speed-up.
...that's it? up to 30% speed-up is "blazingly fast"? On a program that takes 11 seconds to run, that still takes 8 seconds... I was expecting speed-ups that took execution to milliseconds or microseconds.
$ time python -c pass
Kidding aside, even just running an empty C program that does nothing still takes 5ms, so not just a Python problem...
This is so important to listen to.
Some people just can't accept there is a right tool for a job. And filing that hammer down to a needle is not something to be proud of.
The first example given (the exponential function) is basically the worst scenario, because it's a purely numerical computation expressed in pure Python code. Whereas Python's performance is okay-ish for I/O or calling C modules.
From doing Project Euler solutions, I have ample evidence that for pure numerics (e.g. int, float, array), Java is anywhere from 10× to 30× faster than pure Python code executed in CPython. https://www.nayuki.io/page/project-euler-solutions#benchmark...
I believe it is basically impossible for Python to win back all that performance loss without adopting radical and jarring features like static typing, machine-sized integers, and no more "every number is a full-fledged object".
The problem is that projects like PyPy remain on the sidelines, instead of being fully embraced by the community.
Most people doing serious numeric work don't care about the speed of the Python interpreter because all the heavy lifting is done by optimized libraries like Numpy and TensorFlow.
This is nonsense. No language ever will have linear algebra or numeric implementations faster than Fortran/C implementations of BLAS, LAPACK etc. Not being able to make ffi calls makes python essentially unusable for most of its niche uses.
> JS has exactly none of the features that comment claims would be necessary for performance.
I don't see people doing ML, scientific computing etc in JS.
I could even use them from TCL if bothered to do so, thus Python's benefits as programming language are meaningless.
Tensorflow has multiple language support for example.
Being the most flexible and user-friendly interface to the most powerful libraries written in other languages isn't meaningless.
Python already has it, PyPy, but it tends to be ignored by many.
LuaJIT is stuck on a 2017 release, and an old Lua version, is it ever going to be updated?
People don't normally update to new versions of Lua, because they're not backwards compatible; it's not like Python or JS. WoW is still using Lua 5.1, as is MediaWiki. It's unlikely that this will ever change.
Probably at some point someone will continue LuaJIT development. It seems as likely that they will diverge in a different direction as follow PUC.
Chez does unboxed integer arithmetic (but not floats) and does not have to do any OO-like dispatch, and is also probably one of the best language implementations there are.
Is it? Python is really, really dynamic, which contributes to its slowness. You can directly change an instance's __class__ attribute. You can add properties to classes dynamically, changing the fundamentals of how attributes get looked up at run time. You can write a new class, using a new metaclass, and then set an existing instance to the new class.
A great deal of why Python is so slow is that it is really too dynamic. A language doesn't really want to be "as dynamic as Python".
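To make the kinds of dynamism listed above concrete, here is a minimal sketch. The classes `Dog`, `Robot`, `Cyborg` and the metaclass `Meta` are made up purely for illustration; the point is only that every one of these steps is legal at run time, so the interpreter cannot assume much about what `pet.speak` will resolve to.

```python
# Hypothetical classes, used only to illustrate run-time dynamism.
class Dog:
    def speak(self):
        return "woof"


class Robot:
    def speak(self):
        return "beep"


pet = Dog()
before = pet.speak()            # resolved through Dog

# 1. Re-point an existing instance at a different class.
pet.__class__ = Robot
after_swap = pet.speak()        # now resolved through Robot

# 2. Add a property to a class after instances already exist.
Robot.loud = property(lambda self: self.speak().upper())
via_property = pet.loud         # attribute lookup now runs the property

# 3. Create a class with a new metaclass at run time, then switch to it.
class Meta(type):
    """A do-nothing metaclass; real ones can customise class creation."""


Cyborg = Meta("Cyborg", (Robot,), {"speak": lambda self: "bzzt"})
pet.__class__ = Cyborg
after_meta = pet.speak()        # resolved through Cyborg

print(before, after_swap, via_property, after_meta)
```

Because any of these can happen between two calls, each `pet.speak()` has to go through the full lookup unless the implementation adds guards or caches.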
As an example, Smalltalk's become: message completely replaces one object representation with another one.
Most modern lisp compilers do a lot of different things to make CLOS fast, though, prefilling caches and all that for you. Not only that, you can connect to a running program and redefine it while it is running.
In that case, the answer is actually no to my question. Yes, of course you could program in that level of dynamicness, because you could in any language, but it will then slow you down. No sensible CLOS would be as dynamic as Python.
Like I said, in a lot of ways, you don't want to be as dynamic as Python, and I advise against language advocates seeing the phrase "Python is more dynamic than your language" as a cue to jump up and start insisting that they are just as dynamic as Python. Even in hindsight, I'd say the level of dynamicness in Python was a mistake. You don't need it to have a nice, usable, dynamic language, but it has been a ball & chain around its legs in terms of performance for decades.
To be clear, this isn't a criticism of dynamic languages as a concept. I have criticisms, but these aren't it. This is a criticism of Python specifically. A dynamic language can be pretty nice with, let's say, two or three layers of dynamicness, but Python has four or five. If you follow the full process that Python has to go through to resolve "x.y", including all possible points where you might have done something to affect the result, it's crazy overkill. In Guido's defense, when he was writing it way back when, that wasn't clear. There wasn't a lot of highly-relevant prior art to look at for that style language.
I think a lot of CS students had their brains melted by tough classes where Scheme was the vehicle. Thus, you have a population of users that either are anti-evangelists because they have PTSD, or you have evangelists who exist on a plane above the lumpenproletariat like me, which can contribute to the false notion that Scheme is intrinsically esoteric.
I never got a formal CS education, so I don’t have PTSD from Scheme. Also, I am not a super brain, so I guess I haven’t felt compelled to become expert at the hard concepts that Scheme enables.
People fixate on the S-expression syntax (“all those parentheses!”; counter argument “way fewer commas!”). But I think the real issue for Scheme is the lack of libraries that do hard things for “normals” like me.
If I’m strictly honest, I’m more productive in Python than Scheme. This is not because Python is easier. It’s because the Python community has attracted the CS grads who grokked enough of the hard stuff to make libraries that abstract away stuff.
There’s no reason people can’t write Scheme like they write Python. That is, people don’t need to do all the possible stuff in Scheme all the time. Truthfully, Scheme is at least as easy as Python.
Scheme just needs more smart normals writing libraries for mediocre normals like me for Scheme to become popular. Maybe take a domain approach. I feel like adapting R’s tidyverse to Chez is an easy target. Scheme could be the data scientist’s goto. Maybe show people how easy it is to build self-contained serverless apps in the cloud.
If there were a Scheme community that earnestly tackled any domain with the idea of making it accessible to practitioners within the domain, I think it could get real traction.
And it would be fast. Much faster than Python.
My experience is that simple programs usually take about twice as much code in Scheme. This is related to the polymorphism thing, but also I think Python supports imperative programming better. Python is built around autogrowing hash tables (“dicts”) and autogrowing arrays (“lists”); standard Scheme provides neither, preferring alists and cons lists.
(define-method (get (lst <list>) (n <integer>))
  (list-ref lst n))
It would of course slow things down, since I doubt Chez optimizes that, so you get runtime dispatch. SBCL, however, has an amazing object system that does all of Python's OOP with pretty good speed.
You are right that Python is better at imperative programming, but that is also very much a matter of taste. I always get a sour feeling in my mouth when using Python because it is almost exclusively about mutability.
(hashtables are a part of r6rs, btw. Not very pleasant to use since: (hashtable-ref h key default-value))
A number quoted is that SBCL performs roughly on par with, but slightly slower than, Java. For what it is worth, the computer benchmarks game confirms that: https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
That was with optimize level 3, which burns on (car 1), and then you might as well use C, but the language at least allows for those speeds.
SBCL is also an amazing project, and lately the GC story has gotten better IIRC.
"… Racket CS available, a beta version of the Racket on Chez Scheme implementation."
And you get all of this WITHOUT static typing and yes in Julia everything is still a full-fledged object. There is no difference between numbers and other objects in Julia. But numerics is still fast.
My biggest qualm with Julia (and maybe this speaks to my inexperience with the language) is that it isn't always obvious when Julia is going to make a copy. We spent about an hour working through some code that was very slow (props to Julia's profiling tools) but couldn't figure out _why_ it was slow. It turned out that despite our best efforts, Julia was still copying a vector despite us using pre-allocated scratch space for the work.
From my point of view, if I am comparing algorithms then Python's performance doesn't really matter and its ergonomics win. If performance matters I'd just use C++ or Rust.
When Julia makes a copy is pretty straightforward and natural IMHO. I would have been curious to see an example of the code you used where a copy was made without you knowing.
I started with Python, but I find Julia better in almost every single way I can think of. Like even if Julia was slower than Python I would have picked it because I find it so much nicer to use.
I wrote an article here about some of the observations I had about using Python after coming back to it from Julia:
There are some exchanges further down. Would have been interesting to hear your feedback on some of those things.
If you're talking about slices of an array, those always create a copy unless created with the @views macro (or the equivalent function call).
It's not as nice as Python, nor as fast as C++. And much less supported (tools/libraries/...) than both.
So it sits in this awkward middle between Python and C++, basically sucking at both and excelling at none.
With Julia I get first class meta programming. I get awesome multiple dispatch. I get environments and package system really well integrated. I get awesome integration with the shell. Better module system. More natural syntax for arrays. Much better system for closures. Better named functions.
REPL programming in Julia is just light years ahead of anything in Python. The OOP design of Python really kills the REPL experience.
Unless you are a very skilled C++ programmer, Julia is going to outperform you as the program gets larger. C++ programmers are going to get themselves tangled up when trying to run multi-threaded code, running on multiple machines on GPUs and specialized hardware. Julia does this effortlessly.
C++ cannot do JIT, hence as soon as you deal with complicated machine learning algorithms with custom kernels, C++ is going to tie itself into a knot.
Why do you think large Astronomy projects like Celeste and the next major climate models are built in Julia and not C++? Because developers realized that when you need to run massive calculations on super computers on hundreds of thousands of cores, C++ is going to get in the way.
As for libraries and tools. All the Python tools I have tried to match my Julia tools have just sucked. Julia tools often excel over much older Python tools.
Library development moves much faster on Julia than Python. It is not hamstrung by relying on complicated C++ code bases. Also Julia libraries integrate very well, while Python libraries are often their own deserted island. That means a few Julia libraries can do what must be accomplished with dozens of Python libraries.
Main difference being: High memory bandwidth vs. heavy usage of cache, unified programming model for vectorization and multicore parallelism
Generally I think a performance/cost comparison is more useful: Take the price of the GPU and compare it to something with equivalent cost in CPU+RAM.
I find this hard to believe. What kind of numerical work are you doing? Even something as simple as matrix-matrix multiplication should be hard to beat with C, unless your C code is using a cache efficient algorithm.
People always say "use numpy", but that is only possible if your algorithm can be described in terms of vectorized operations. For many kinds of processing, the only alternative is C/C++ (through Cython)
> People always say "use numpy", but that is only possible if your algorithm can be described in terms of vectorized operations. For many kinds of processing, the only alternative is C/C++ (through Cython)
My personal experience is that you can actually get another factor of 2 or 3 speed-up by ditching Cython and using actual C instead (I think it's because optimizers have a hard time cleaning up the C that Cython produces), even if you've turned off things like bounds checking.
I guess you haven't tried it, then. But your lack of knowledge is not a reasonable justification for attacking my integrity.
> Even something as simple as matrix-matrix multiplication
That's the best case for Numpy, not the worst. SGEMM is indeed just as fast when invoked from Numpy as when invoked from C, at least for large matrices.
CPU SIMD code can be trivially mixed with non SIMD code but mixing GPU and CPU code may negate the benefits.
Pypy is often quite a bit faster than CPython for real-world programs so clearly some improvement is possible.
Two options you can try guiding a budding python user to are Nim and F#
Short: they're not portable beyond Windows.
F# has run on Windows/Linux/Mac as standard since .NET Core, and Mono runs it anywhere: Nintendo Switch, iOS, Android, PlayStation.
I have to say, the desperate lengths Python programmers will go to to use it for things it was not meant for rather than learn or use other languages is one of the aspects I most dislike about it. However fast you make it, the same effort would have made it that much faster again in a performant language.
Then at some point if Python isn't needed because you know exactly what you want your software to do, rewrite it in C++ or whatever.
Also with CFFI and other interoperable libraries, it's really quite easy to write some heavy work in a more appropriate language and call into it.
If you already know Python, and Python packages already do all you will ever need then sure stick with that. But I don't get why people would go to such lengths to avoid using a new language. Being proficient in Julia is a lot less work than maintaining proficiency in Python and C++.
I don't mean to discredit the advantages Julia clearly has over Python, but these are just the kinds of problems that make people like me stick with tried and tested last-gen languages like Python.
A lot of the issues are simply that people have not learned a sensible workflow with Julia. Python guys have a lot of habits that don't translate well to Julia. I know because I work daily with two hardcore python guys. I notice all the time how we approach problems in very different ways.
Python guys seem to love making lots of separate little programs they launch from the shell. Or they just relaunch whole programs all the time.
In Julia in contrast you focus on packages from the get go and you work primarily inside the Julia REPL. You run Revise.jl package which picks up all the changes you make to your Julia package.
I guess it just depends on the workflows you are used to. For me it is the opposite. Whenever I have to jump into our Python code base I absolutely hate it. It is very unnatural for me to work in the Python way. I also find Python code kind of hard to read compared to Julia code.
But I know Python coders have the opposite problem. Basically Python guys look a lot at module names when reading code. Julia developers look more at types. The difference makes some sense since you don't really write types in Python code.
I found that the new Python type annotation system helped me feel at home in Python.
That's the problem, nobody is actively learning Cobol these days and nobody knows how much is running out there in the wild, because none of the big banks or credit companies will actually admit it.
You joke but the jobs are still out there, still being posted and companies are still hiring. https://www.wellsfargojobs.com/job/irving/apps-systems-engin...
Elixir and Erlang both lack a type system and maintain a focus on keeping the language constructs simple rather than providing various high level abstractions some other more expressive languages have.
I don't think it's an issue of not wanting to learn other languages.
If you really like screwdrivers, go to screwdriving events, and have a collection of screwdrivers, you may end up living in an echo chamber where you and your peers are convinced that a screwdriver is a good tool for driving a nail.
If you're trying to make something go 'really fast' these days, that means either (a) some kind of vectorization, or (b) pushing work onto a GPU. In either case Python is unlikely to perform meaningfully worse than any other language, since the host language isn't doing much anyway.
This is an unfortunately common misunderstanding of the phrase: "premature optimization is the root of all evil."
Optimization is a crucial part of developing successful software. It can be harmful to get overzealous with certain types of optimization, however basic wins like using string builder primitives or formatted strings from the outset is hardly premature. Some optimizations can only be realized at the early conceptual stages too; going for those early on isn't always premature.
“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”
It would not have mattered that I could have provided a rational explanation for why that was a rational choice in that instance. They would have just kept reciting scripture and called me a heretic.
Meanwhile people will let you commit the worst, most unmaintainable code, as long as it doesn't break any of the 10 commandments of coding or whatever the equivalent would be.
I wonder if using string builders is really a critical 3%... and how many people who do practice premature optimization actually measure if their optimization of choice is in their program's critical 3%.
In languages that have immutable strings, a chain of `+=` operators is basically O(n^2) vs O(n) for a string builder. For how easy the optimization is, there's little excuse to not use them for any bulk append operations.
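A rough sketch of the two styles in Python, where strings are immutable. The function names are made up for illustration. One caveat: CPython can sometimes extend a string in place when `+=` is applied to a local that nothing else references, so the measured gap may be smaller than the O(n^2) worst case suggests; benchmark on your own interpreter before relying on either claim.

```python
import io


def concat_plus(parts):
    """Repeated +=; in the worst case each step copies everything built so far."""
    out = ""
    for p in parts:
        out += p
    return out


def concat_join(parts):
    """Collect the pieces and join once at the end."""
    return "".join(parts)


def concat_stringio(parts):
    """io.StringIO plays the role of a string builder."""
    buf = io.StringIO()
    for p in parts:
        buf.write(p)
    return buf.getvalue()


parts = [str(i) for i in range(1000)]
assert concat_plus(parts) == concat_join(parts) == concat_stringio(parts)
```

All three return the same string; only the amount of copying differs.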
rv = []
for x in y:
More recently, though, I've often been preferring the following construction instead:
for x in y:
However, if you're optimizing a deeply nested string generator, you're better off using the list approach and passing in the incomplete list to callee functions so they can append to it. Despite the suggestive syntax, at least last time I checked, `yield from` doesn't directly delegate the transmission of the iterated values; on this old netbook, it costs about 240 ns per item per stack level of `yield from`. (By comparison, a simple Python function call and return takes about 420 ns on the same machine.)
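Here is a sketch of the two approaches for a nested generator of strings. The task (rendering a nested list as text) and the function names are hypothetical, not the code from the comment above. In the list version the callee appends to a list the caller owns; in the generator version every item has to pass back up through one `yield from` per nesting level.

```python
# Hypothetical example: rendering a nested list as text.
def render_into(node, out):
    """List approach: callees append to a list owned by the caller."""
    if isinstance(node, list):
        out.append("(")
        for child in node:
            render_into(child, out)
        out.append(")")
    else:
        out.append(str(node))


def render_list(node):
    out = []
    render_into(node, out)
    return "".join(out)


def render_gen(node):
    """Generator approach: each nesting level re-yields via `yield from`."""
    if isinstance(node, list):
        yield "("
        for child in node:
            yield from render_gen(child)
        yield ")"
    else:
        yield str(node)


tree = [1, [2, [3, 4]], 5]
assert render_list(tree) == "".join(render_gen(tree))
```

The generator version reads more nicely; whether the per-level cost matters depends on nesting depth and item count, so it is worth timing both on real data.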
But if you really wanted your code to run fast you wouldn't have written it in Python anyway. You'd've used JS, LuaJIT, or Golang. Or maybe Scheme. Or C or Rust. But not Python.
This is why you really should always benchmark. In my view, "premature optimization" is not so much about optimizing too early in a project, it's about writing code a particular way you assume will make it faster without testing first.
I agree that you shouldn't operate on assumptions alone for a decision like whether or not you should use a string builder. That's where prior experience should come into play to guide your decisions. For instance, I am not a JS developer, so I have no prior experience to inform a decision to use a builder vs concat in JS.
I cited that case in particular since the slowness of concatenation was called out in the article, and in some languages it actually does make a huge difference at a very small complexity cost.
Don't assume a certain way of coding is faster because you read it on the Internet, actually profile your code.
I am amazed and saddened, that in 2020, concatenating strings, regardless of the form or language, is not blazing fast across all environments.
Python 3.7.4 (default, Jul 28 2019, 22:33:35)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.8.0 -- An enhanced Interactive Python. Type '?' for help.
In : class X:
...:     def y(z):
...:         return z.a + z.a + z.a + z.a + z.a
...:     def w(z):
...:         a = z.a
...:         return a+a+a+a+a
In : x = X()
In : x.a = 3
In : x.y()
In : x.w()
In : %timeit x.y()
1.7 µs ± 11.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In : %timeit x.w()
1.11 µs ± 5.95 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Note that here we are comparing two instance attribute accesses against seven, not zero against five. Evidently each of them cost about 118 ns, so if we could reduce them to zero, the method call and return and four additions would cost only 870 ns, which is closer to half the runtime than ¾.
Moral: benchmark before pooh-poohing a hotspot.
Also though note that several thousand instructions is a pretty heavy price to pay for four integer additions.
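For anyone who wants to repeat this measurement outside IPython, here is a small sketch using the standard `timeit` module with the same class. The iteration count is an arbitrary choice, and the absolute numbers will differ by machine and Python version, so none are shown here.

```python
import timeit


class X:
    def y(z):
        return z.a + z.a + z.a + z.a + z.a

    def w(z):
        a = z.a              # one attribute lookup, cached in a local
        return a + a + a + a + a


x = X()
x.a = 3

n = 200_000                  # arbitrary; raise it for steadier numbers
# Time the full expression, including the method lookup, as %timeit does.
t_y = timeit.timeit("x.y()", globals={"x": x}, number=n) / n
t_w = timeit.timeit("x.w()", globals={"x": x}, number=n) / n

print(f"x.y(): {t_y * 1e9:.0f} ns per call")
print(f"x.w(): {t_w * 1e9:.0f} ns per call")
```

Run it a few times; a single run on a noisy machine can be misleading.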
It can but it will more likely account for much much less than that, unless all your programs are massive loops that do little more than access the same attribute repeatedly.
Using your definition of class X
313 ns ± 18.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
__slots__ = ('a')
a = z.a
271 ns ± 7.13 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Also you are missing a comma in your would-be tuple.
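On the comma: a short sketch of what the two spellings mean. The class names are made up. `('a')` is just the string `'a'` in parentheses, not a tuple. It happens to work anyway because Python accepts a bare string in `__slots__` as a single slot name, but the one-element tuple needs the trailing comma.

```python
class WithDict:
    def __init__(self):
        self.a = 3


class WithSlots:
    __slots__ = ("a",)       # trailing comma: a real one-element tuple

    def __init__(self):
        self.a = 3


class StringSlots:
    __slots__ = ("a")        # parentheses only: this is the string "a",
                             # which Python treats as a single slot name

    def __init__(self):
        self.a = 3


d, s, t = WithDict(), WithSlots(), StringSlots()
print(hasattr(d, "__dict__"), hasattr(s, "__dict__"), hasattr(t, "__dict__"))

try:
    s.b = 1                  # no slot named "b" and no __dict__ to fall back on
except AttributeError as exc:
    print("AttributeError:", exc)
```

Slotted instances have no per-instance `__dict__`, which is where the memory saving and the slightly cheaper attribute access come from.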
Edit: I tested it and there is a difference
I picked the example you mentioned, defined the regex constant as "s" and the line constant as "asas", increased the iteration count from 10^4 to 10^6 to make the difference more noticeable (got no noticeable difference w/o that change) and measured the programs with time in Termux on my Nexus 5.
Real time for fast: 3.150, real time for slow: 3.623. Times average of three runs ("time python fast|slow.py"). I ran once with same setup and threw out results before the runs.
Edit2: actually you mentioned a different example, my bad. I didn't measure any others because vim with a touchscreen keyboard is a PITA, so no idea if the one you're referring to is true.
I wouldn't call that blazingly fast python though, that's barely approaching C, which is also slow. Fast is maxing out SIMD + cores + GPUs + bandwidth, so should aim for ~20X+ faster than regular C...
You can't even know if you are dealing with micro-optimization or getting any evidence that stuff like that helps.
I wrote a tool py-spy (https://github.com/benfred/py-spy) that is worth checking out if you’re interested in profiling Python programs. Not only does it solve those problems with cProfile - py-spy also lets you generate a flamegraph, profile running programs in production, works with multiprocess Python applications, can profile native Python extensions etc.
Side note: I also used pyreverse, now part of pylint, to diagram entire projects and get a class hierarchy. It helped tremendously in refactoring and decoupling code through whole projects, finding redundancies, and have a better architecture.
I'll have a look at py-spy. Thanks for that.
At our company, py-spy has helped us a lot for our line-of-business application. I'm not affiliated with Ben in any way, but he deserves some praise for his work on py-spy.
Stopped here immediately. I have been writing software for more than 20 years, mainly in C++ and Python. No professional would start this kind of discussion with this childish attitude (apart from the fact, that content-wise the problem was beaten to death for decades).
As a Julia developer I see this a lot. You point out Julia advantages and the Python guy will respond with: Oh I can do that in Python too if I use package X, Y, Z combined with feature A, B, C. Basically their response to a simple well engineered feature is a complete mess of a solution. But hey they prefer that because they can still stick the label Python on top of it.
I admit I also get set in my ways, but at least I like to think that when I dismiss another language it is not for purely silly reasons.
I’m pretty sure you’ll also occasionally bang nails for which Julia is a poor hammer, you just don’t realize it.
But I kind of keep a collection of favorite languages under my belt which cover different areas. My favorites are probably Julia, Go, Swift, Python, Lua and LISP in that order.
If I need more low level style coding I would go with Go (pun not intended). Swift is nice if you actually want to make GUI applications and something that is quite robust. The type system in Swift is quite good at catching many problems.
Otherwise you end up with a project which uses all of Python/Perl/Java/Julia/MATLAB/R/C++/Fortran/Rust/Go, because hey, for this particular problem X has the best solution so let's use that.
People are often WAY WAY too reluctant to rewrite code. Instead they spend years maintaining garbage.
I remember rewriting an iOS app from Objective-C to Swift. Everybody thought it was a waste of time and should not be done. People tend to only think about what is of immediate benefit.
I only rewrote the most important parts. About 30% remained in Objective-C. Once it was in Swift lots of developers suddenly started getting interested in joining. They loved working with Swift and made lots of contributions.
But then they hit the Objective-C parts and were bummed out. All the guys who had said rewriting to Swift was a waste of time were now complaining about the existence of Objective-C code and that we had to get rid of it.
So I rewrote the rest. The point is that people seldom realize how much benefit a better language can be until they actually start working on a code base written in a better language. Then they will often start hating the very code base they had previously defended.
Think of the millions of lines of Cobol code stuck on mainframes which is almost impossible to maintain today. We are stuck with that because at every juncture where there was a chance to upgrade and switch to a more modern technology, somebody made a variant of your argument.
Sure, you cannot have a free-for-all. But it must be possible to have a sensible process where you experiment with some alternatives. Evaluate the pros and cons and then switch to the better choice.
> I like to think that when I dismiss another language it is not for purely silly reasons.
One should also choose a suitable language, if you do not want to risk falling victim to the same attribute.
This article is written by someone who obviously doesn’t know much about CS.
Please HN community, try to not upvote these, it’s a waste of time for all of us.
"I'm (mostly) not going to show you some hacks, tricks and code
snippets that will magically solve your performance issues. This
is more about general ideas and strategies, which when used, can
make a huge impact on performance, in some cases up to 30% speed-up."
Python CPU-bound programs are e.g. 30 times slower than C or Java, and "up to 30% speedup" makes them still 20 times slower which is really far from "blazingly".
An example (best scores):
Or just a modern language with nice quality of life features:
(All of these are seconds)
Python performance varies the most in pl benchmark game.
Resuming a generator in CPython is a lot faster than creating a whole new function call, and especially a whole new method call, contrary to what the article said. But often enough it's faster to just eagerly materialize a list result.
Some other good tips: %timeit, ^C, sort -nk3, Numpy, Pandas, _sre, PyPy, native code. In more detail:
• For benchmarking, use %timeit in IPython. It's much easier and much more precise than time(1). For super lazy benchmarking use %%time instead.
• The laziest profiler is to interrupt your program with ^C. If you do this twice and get the same stack trace, it's a good bet that's where your hotspot is. cProfile is better, at least for single-threaded programs. Others here suggest line_profiler.
• If you have output from the profile or cProfile module saved in a file, you can use the pstats module to re-sort it by different fields. But you probably don't, you have some text it output. The shell command `sort -nk3` will re-sort it numerically by column 3, which is close enough. In Vim you can highlight the output and type !sort -nk3, while in Emacs it's M-| sort -nk3.
• You can probably speed up a pure Python program by a factor of 10 with Numpy or Pandas. If it's not a numerical algorithm, it may not be obvious how, but it's usually feasible. It requires sort of turning the whole problem sideways in your mind. You may not appreciate the effort when you are attempting to modify the code.
• The _sre module is blazingly fast for finite state machines over Unicode character streams. It can be worth it to transmogrify your problem into a regular expression if you can.
• PyPy is probably faster. Use it if you can.
• The standard advice is to rewrite your hotspots in C once you've found them. Maybe this should be updated; Cython, Rust, and C++ are all reasonable alternatives, and for invoking the C etc., you have available cffi and ctypes now. In Jython this is all much simpler because you can easily invoke code in Java, Kotlin, or Clojure from Jython. An underappreciated aspect of this is that using native code can save you a lot of memory as well as instructions, and that may be more important. Consider trying __slots__ first if you suspect this may be the case.
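For the cProfile and pstats tip above, a minimal sketch of the save-then-re-sort workflow might look like this. The workload functions and the file name are hypothetical stand-ins; only the `cProfile` and `pstats` calls come from the standard library.

```python
import cProfile
import io
import os
import pstats
import tempfile


def slow_sum(n):
    # Hypothetical hotspot: a pure-Python loop.
    total = 0
    for i in range(n):
        total += i * i
    return total


def workload():
    return [slow_sum(20_000) for _ in range(50)]


profiler = cProfile.Profile()
profiler.enable()
results = workload()
profiler.disable()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "profile.out")  # assumed file name
    profiler.dump_stats(path)                # save the raw profile data

    # Reload the saved data and re-sort it by different fields.
    buffer = io.StringIO()
    stats = pstats.Stats(path, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)  # top 5 by cumulative time
    stats.sort_stats("tottime").print_stats(5)     # top 5 by own time
    report = buffer.getvalue()

print(report)
```

Having the raw file around is what makes the re-sorting cheap; if all you kept was the printed text, `sort -nk3` is the fallback.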
I do that sometimes, but it has some pitfalls. If most of the time is spent inside a C module (for instance in numpy), then the interrupt won't be caught before the C module is exited, which can lead to a wrong stacktrace.
At this stage why are you even using python anyway? The code isn’t going to be very pythonic or readable and the effort would in my opinion be better spent on C++ or Rust.
“If you want your code to run faster, you should probably just use PyPy.” — Guido van Rossum
Personally, I’ve tried pypy without issues, out of curiosity, but in about 15 years of using python I never ran into python code being the performance bottleneck. There are highly performant modules for everything, too.
So basically anything where the hot path is in pure Python, rather than a standard library method.
s += num / fact
> Now, re.findall() does cache the last 100 or so regexps, so it probably won't re-evaluate the regex each time. But really, pre-compute that regex with "_my_pattern = re.compile(regex) ... _my_pattern.findall()" and avoid even that cache lookup.
cpburns2009 says it's 512 these days, which doesn't change the essence of my comment.
EDIT: The cache appears to be 512 on Python 3.6 so maybe precompiling isn't necessary unless you frequently use a large number of regular expressions.
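To make the precompile suggestion concrete, here is a small sketch. The sample text and regex are invented for illustration, and whether the difference matters depends on how hot the call is; on a single long string the cache lookup is probably noise.

```python
import re
import timeit

TEXT = "order 66, route 101, room 237 " * 100  # made-up sample input
REGEX = r"\d+"

# Compile once at import time; later calls skip the module-level cache lookup.
_my_pattern = re.compile(REGEX)


def with_module_function():
    return re.findall(REGEX, TEXT)     # looks the pattern up in re's cache


def with_precompiled():
    return _my_pattern.findall(TEXT)   # uses the compiled pattern directly


for fn in (with_module_function, with_precompiled):
    seconds = timeit.timeit(fn, number=2_000)
    print(f"{fn.__name__:22s} {seconds:.3f}s")
```

Both functions return the same matches; only the lookup path differs.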
I find it a good habit to always import modules and almost never (sane exclusions apply) import individual functions from them. If I use something frequently, I'd alias it for clarity (`import sqlalchemy as sa`)
The reason is that otherwise, patching with mocks becomes somewhat tricky, as you'll have to patch functions in each individual importer module separately. Here's an example: https://stackoverflow.com/a/16134754/116546
Maybe that's wrong but my idea is that I don't want to assume which module calls some specific function but just mock the thing (e.g. make sure Stripe API returns a mock subscription - no matter where exactly it's called from). Then, if I refactor things and move a piece of code around (e.g. extract working with Stripe to a helper module), my unit tests just continue to work.
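A self-contained sketch of the patching difference, using a throwaway in-memory module in place of a real API client. The `payments` module and `get_subscription` function are hypothetical names, not Stripe's actual API.

```python
import sys
import types
from unittest import mock

# Hypothetical stand-in for a third-party client module.
payments = types.ModuleType("payments")
payments.get_subscription = lambda user: "real subscription"
sys.modules["payments"] = payments

import payments as pay                   # module import: attribute looked up at call time
from payments import get_subscription   # name import: binds the function object right now


def via_module(user):
    return pay.get_subscription(user)


def via_name(user):
    return get_subscription(user)


# Patch the function once, on the module that defines it.
with mock.patch("payments.get_subscription", return_value="mock subscription"):
    patched_module_result = via_module("alice")  # sees the mock
    patched_name_result = via_name("alice")      # still holds the original function

print(patched_module_result)
print(patched_name_result)
```

With the `from ... import` style you would have to patch the name inside every importing module separately, which is the trickiness described above.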
> Based on recent tweet from Raymond Hettinger, the only thing we should be using is f-string, it's most readable, concise AND the fastest method.
I love f-strings, but to the best of my knowledge, one can't use f-strings for i18n/l10n, so all end-user-facing texts still have to use `%` or `format`. E.g. `_("Hello, {name}").format(name=name)`.
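A quick sketch of why that is, assuming the usual `_ = gettext.gettext` alias and no translation catalog installed (so `_` just returns its argument unchanged). The name `Ada` is only sample data.

```python
import gettext

# Without an installed catalog, gettext.gettext returns the message as-is.
_ = gettext.gettext

name = "Ada"

# The catalog is keyed on the template, so interpolation must happen
# after the lookup.
greeting_format = _("Hello, {name}").format(name=name)
greeting_percent = _("Hello, %(name)s") % {"name": name}

# An f-string is interpolated before _() is called, so the lookup key
# would be the already-filled string instead of the template.
already_filled = f"Hello, {name}"

print(greeting_format)
print(greeting_percent)
print(already_filled)
```

The last line is the problem case: a translator would need one catalog entry per possible name.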
The weirdest thing is that they aren't even using python nor it seems that they're being forced to use it currently, making all this... Ranting (there's literally no other word for this) all the more inexplicable.
I don't understand it; I've been using Go for a year now at work. I hate pretty much everything about it, yet I haven't ranted about it in an article about the language for about that time. There's just no point to it.
But Python zealots can be annoying. That's true for any language. Personally I don't like python's asynchronous programming paradigm. Objectively Go does it better than Python.
So unless you are into adopting PyPy, you will be better off with JVM and .NET stacks.
Plenty of languages to choose from, while benefiting from their performance and tooling.
That, and it has replaced Java in many introductory programming courses.
Which is good; when learning to program, performance isn't a concern as such.
I have known Python since Zope was the only reason to use it, so around Python 1.5 or so.
Other than replacing what I used Perl for, regarding UNIX shell scripting, I never used Python in any scenario where performance might come into play.
There are plenty of options that beat Python's LOC, while providing an AOT/JIT toolchain out of the box.
Performance varies wildly for basic coding decisions across platforms, especially across different combinations of browser + OS.
I'm still deciding on a name; I was thinking of concepts like ‘popular’ from the song by Nada Surf, or photo finish (horse racing), or something like unfortunate/wheel of unfortune, poking fun at the need to have this lib.
Here's a messy example that shows this issue (try it in different browsers).
Can someone chime in about the L1 cache? The claim is made without measurements, so I am skeptical.
Honestly the quality bar for most things written in Python is pretty low, so anything that can help people improve is fine. So kudos to the author.
Almost everything the Python interpreter does is ridiculously slow, even for an interpreted language. The language design prevents fast implementations.
 Restricted subsets of Python do not count
 No, PyPy is not fast. It is slow, even for a JIT.
Apparently the fact that the whole world may change at any given moment, and that every single operation requires a method call, doesn't prevent the existence of reasonably good JIT compilers for Smalltalk; in fact, they are the genesis of Java JITs.
Wow, this is one I didn't expect. I always wrap my scripts in a main function out of pure perfectionism (or perhaps that's OCD), but the fact that a script without it runs slower seems counter-intuitive and should really be among the first things taught.
No, it shouldn't. You don't teach a language by discussing micro-optimizations, especially when you're talking about Python.
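For anyone who wants to see the main-function effect rather than take it on faith, here is a rough sketch. It runs the same loop once as module-level code, where names live in the globals dict, and once inside a function, where they are fast locals. The loop size is arbitrary, and the gap will differ between CPython versions.

```python
import timeit

N = 200_000  # arbitrary loop size

MODULE_LEVEL = f"""
total = 0
for i in range({N}):
    total += i
"""

IN_MAIN = f"""
def main():
    total = 0
    for i in range({N}):
        total += i
    return total

total = main()
"""

module_code = compile(MODULE_LEVEL, "<module-level>", "exec")
main_code = compile(IN_MAIN, "<in-main>", "exec")


def run(code):
    # Execute the code object as if it were a script, in a fresh namespace.
    namespace = {}
    exec(code, namespace)
    return namespace["total"]


t_module = timeit.timeit(lambda: run(module_code), number=10)
t_main = timeit.timeit(lambda: run(main_code), number=10)
print(f"module level: {t_module:.3f}s  inside main(): {t_main:.3f}s")
```

Both variants produce the same total, so whatever difference shows up comes from how the names are looked up.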