Hi John,
this is to track the bug reported by Joey at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=624389 here, to ensure it is not forgotten. For your convenience, here is the original report:
For quite a while I have been using missingh's pipeBoth with success; but as
soon as my program was rebuilt with ghc 7, it started stalling when large
quantities of data needed to be passed through the pipe.
Here is a simple test case. It needs to run in a git repository.
    import System.Cmd.Utils

    main = do
        as <- checkAttr "blah" $ map show [1..100000]
        sequence $ map (putStrLn . show) as

    checkAttr attr files = do
        (_, s) <- pipeBoth "git" params $ unlines files
        return $ lines s
      where
        params = ["check-attr", attr, "--stdin"]
It queries git for attribute values for 100000 files. With ghc 6, it
should run to completion. With ghc 7, it stalls, deadlocked, after
a varying number of files, under 1000:
select(2, [], [1], NULL, {0, 0}) = 1 (out [1], left {0, 0})
write(1, "\"701: blah: unspecified\"\n", 25"701: blah: unspecified") = 25
select(2, [], [1], NULL, {0, 0}) = 1 (out [1], left {0, 0})
write(1, "\"702: blah: unspecified\"\n", 25"702: blah: unspecified") = 25
select(2, [], [1], NULL, {0, 0}) = 1 (out [1], left {0, 0})
write(1, "\"703: blah: unspecified\"\n", 25"703: blah: unspecified") = 25
select(2, [], [1], NULL, {0, 0}) = 1 (out [1], left {0, 0})
write(1, "\"704: blah: unspecified\"\n", 25"704: blah: unspecified") = 25
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
gettimeofday({1303958431, 174266}, NULL) = 0
select(7, [], [6], NULL, {0, 0}) = 1 (out [6], left {0, 0})
write(6, "345\n15346\n15347\n15348\n15349\n1535"..., 8096
The program is blocked trying to write to git-check-attr, and
git-check-attr is in turn blocked waiting for its output to be read.
I'm skipping over missingh and filing this bug directly on ghc because
I think it's unlikely missingh is at fault. IIRC, pipeBoth works by
sparking off a helper thread, which is used to write input to a command.
Unless it made a bad assumption about that being a safe thing to do,
this must be a bug in GHC?
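To make the helper-thread pattern described above concrete, here is a minimal sketch of it. This is not MissingH's actual pipeBoth implementation; pipeThrough is a hypothetical name, and the sketch assumes the process package's createProcess API rather than whatever MissingH uses internally. A forked thread writes the input while the calling thread drains the output, so neither side of the pipe should be left full with nobody reading it.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate, finally)
import System.IO (hClose, hGetContents, hPutStr)
import System.Process
  ( CreateProcess (..), StdStream (..)
  , createProcess, proc, waitForProcess )

-- Hypothetical helper (not MissingH's pipeBoth): feed `input` to a command
-- from a helper thread while the calling thread reads the command's output.
pipeThrough :: FilePath -> [String] -> String -> IO String
pipeThrough cmd args input = do
  (Just hin, Just hout, _, ph) <-
    createProcess (proc cmd args) { std_in = CreatePipe, std_out = CreatePipe }
  done <- newEmptyMVar
  -- Writer thread: it may block on a full pipe without stalling the reader.
  -- `finally` signals completion even if the write fails.
  _ <- forkIO $
    (hPutStr hin input >> hClose hin) `finally` putMVar done ()
  out <- hGetContents hout
  -- Force the whole (lazy) output so the child is never left blocked
  -- on a full stdout pipe.
  _ <- evaluate (length out)
  takeMVar done
  _ <- waitForProcess ph
  return out
```

With this sketch, the test case's checkAttr would correspond to calling pipeThrough "git" ["check-attr", attr, "--stdin"] (unlines files). Whether this avoids the stall on an affected ghc 7 build is untested here; it only illustrates the pattern that pipeBoth is believed to rely on.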
FWIW, I have worked around this in my code by forking a process, not a
thread, to do the writing. That works fine; it is just a little more
heavyweight than needed. I'm concerned about all the other potential
callers of pipeBoth out there, however.