Naïve quicksort speeds #haskell #python
I was watching Graham Hutton's lectures on Haskell and he used the naïve straight-forward implementation of quicksort as the first code example, working out what it does.
For fun, I wrote the same in Python, to compare elegance. Running the code for even small lists gave an exception about recursion depth. Fair enough, recursion is common in functional programming and less so in imperative languages like Python (the Haskell version doesn't throw such exceptions).
As it is the naïve version, I had to scramble the list to fix this, as running on the sorted list gives us the worst case (the pivot element is näively taken as the first in the list, so the algorithm splits the list in one element and the rest, oops). Here is the Python code:
from random import sample
def qsort(list):
if list == []:
return list
else:
x=list[0]
xs=list[1:]
return qsort([y for y in xs if y < x]) + [x] + qsort([z for z in xs if z >= x])
n=10000000
g=qsort(sample(range(1, n+1), n))
print(g[-1])
The code is short, but perhaps not what you would call elegant? The list
comprehension part does look quite close to the Haskell version, but the
pattern matching elegance is missing (although recent PEPs are going to
add pattern matching to Python, I am
not sure if it is going to look more elegant). Also, I skipped the
conventional if __name__ == "__main__":
boiler-plate, which would have
made it look worse.
Running it on a list of the integers from 1 to 10 million (scrambled) and printing the last (to not make printing be the limiting factor) takes ~48 seconds on my laptop, and makes the fan spin up.
I was expecting the Haskell version to be more elegant to look at and run faster, as it is - after all - compiled. When I printed the entire list, it was very slow, so again I chose to just print the last entry of the result:
import System.Random
import System.Random.Shuffle
f [] = []
f (x:xs) = f ys ++ [x] ++ f zs
where
ys = [ y | y <- xs, y < x ]
zs = [ z | z <- xs, z >= x ]
main = do
gen <- getStdGen
let g = f (shuffle' [1..10000000] 10000000 gen)
putStrLn (show (last g))
Much to my surprise, running this takes ~55 seconds - around 14% slower than Python?! Usually people talk about Python being slow. I know this is the naïve(st) implementation, but so is the Python one I'm comparing with.
It would be interesting to know what part of the Haskell version the most time is spent in.
Whoa, just changing the last line from:
putStrLn (show (last g))
to:
putStrLn (show (head g))
changes the running time to ~39 seconds. I guess lists really are implemented very inefficiently. Or am I fooling myself with laziness?
This was with Python 3.9.1 and GHC 8.8.4 on an Intel i7-9750H equipped laptop running Debian unstable.
Add comment
To avoid spam many websites make you fill out a CAPTCHA, or log in via an account at a corporation such as Twitter, Facebook, Google or even Microsoft GitHub.
I have chosen to use a more old school method of spam prevention.
To post a comment here, you need to:
- Configure a newsreader¹ to connect to the server
- Open the newsgroup called
¹ Such as Thunderbird, Pan, slrn, tin or Gnus (part of Emacs).koldfront.dk
on port1119
using nntps (nntp over TLS).lantern.koldfront
and post a follow up to the article.Or, you can fill in this form: