The Haskell 98 Library Report: List Utilities

によると,

nub              :: (Eq a) => [a] -> [a]
nub []           =  []
nub (x:xs)       =  x : nub (filter (y -> not (x == y)) xs)

なので,実はnubはO(n^2). Eq だけ,だから仕方ないのかもしれないが.

しかし,

nub :: (Ord a) => [a] -> [a]

にすれば,O(n log n) にできる.Setのmember, insertは log n であることを利用する.

import Data.Set (empty, member, insert)
nub' :: (Ord a) => [a] -> [a]
nub' l = nub'' l empty
where nub'' [] _         = []
nub'' (x:xs) s
| x `member` s = nub'' xs s
| otherwise    = x : nub'' xs (x `insert` s)

O(n^2) と O(n log n) では大違いです. cf. sort

ところで,ソース

http://haskell.org/ghc/docs/latest/html/libraries/base/src/Data-List.html#nub

を見てみたら,#ifdef USE_REPORT_PRELUDE なんて文がある.どうやらnubの実装には2種類あるみたい.

しかし,自分のghcはどちらを使っているのか?

nub                     :: (Eq a) => [a] -> [a]
#ifdef USE_REPORT_PRELUDE
nub                     =  nubBy (==)
#else
-- stolen from HBC
nub l                   = nub' l []             -- '
where
nub' [] _           = []                    -- '
nub' (x:xs) ls                              -- '
| x `elem` ls   = nub' xs ls            -- '
| otherwise     = x : nub' xs (x:ls)    -- '
#endif
-- | The 'nubBy' function behaves just like 'nub', except it uses a
-- user-supplied equality predicate instead of the overloaded '=='
-- function.
nubBy                   :: (a -> a -> Bool) -> [a] -> [a]
#ifdef USE_REPORT_PRELUDE
nubBy eq []             =  []
nubBy eq (x:xs)         =  x : nubBy eq (filter ( y -> not (eq x y)) xs)
#else
nubBy eq l              = nubBy' l []
where
nubBy' [] _         = []
nubBy' (y:ys) xs
| elem_by eq y xs = nubBy' ys xs
| otherwise       = y : nubBy' ys (y:xs)
-- Not exported:
-- Note that we keep the call to `eq` with arguments in the
-- same order as in the reference implementation
-- 'xs' is the list of things we've seen so far, 
-- 'y' is the potential new element
elem_by :: (a -> a -> Bool) -> a -> [a] -> Bool
elem_by _  _ []         =  False
elem_by eq y (x:xs)     =  y `eq` x || elem_by eq y xs
#endif

# log は定数だから, O(n log n) => O(n).やったリニアだ.