generalized fold (and other newbie question)
Johannes Waldmann
waldmann at imn.htwk-leipzig.de
Mon Oct 4 12:23:04 EDT 2010
Hi.
What is the accelerate example code for (dense, standard)
array multiplication?
Can I express something like the following,
to get c = a * b (both Array DIM2 Int)
a', b' :: Array DIM3 Int such that
a'!(i,j,k) = a!(i,j) ; b'!(i,j,k) = b!(j,k)
then obtain c as a fold (+) 0 along the middle coordinate
of zipWith (*) a' b'
Two questions from this:
1. I assume the first step can be done with "replicate" but exactly how?
(The API doc is sorely missing a formal spec here, and elsewhere.)
2. this would need a fold that goes from DIM3 to DIM2,
but I don't see any such function.
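To pin down what I mean, here is a minimal model of the intended computation in plain Haskell lists. It is not accelerate code and says nothing about the accelerate API; all names (replicateInner, replicateOuter, zipWith3D, foldMiddle, mmul) are made up for this sketch. It assumes rectangular, non-empty matrices, and takes the elementwise step as a product, as matrix multiplication requires.

```haskell
import Data.List (transpose)

type Matrix = [[Int]]

-- a'!(i,j,k) = a!(i,j): copy each entry of a along a new innermost axis of length p.
replicateInner :: Int -> Matrix -> [[[Int]]]
replicateInner p a = [ [ replicate p x | x <- row ] | row <- a ]

-- b'!(i,j,k) = b!(j,k): copy the whole of b along a new outermost axis of length n.
replicateOuter :: Int -> Matrix -> [[[Int]]]
replicateOuter n b = replicate n b

-- Elementwise combination of two rank-3 arrays of equal shape.
zipWith3D :: (Int -> Int -> Int) -> [[[Int]]] -> [[[Int]]] -> [[[Int]]]
zipWith3D f = zipWith (zipWith (zipWith f))

-- fold (+) 0 along the middle coordinate j, going from rank 3 to rank 2.
foldMiddle :: [[[Int]]] -> Matrix
foldMiddle = map (map sum . transpose)

-- c = a * b for a of shape n x m and b of shape m x p.
mmul :: Matrix -> Matrix -> Matrix
mmul a b = foldMiddle (zipWith3D (*) a' b')
  where
    n  = length a
    p  = case b of
           (r:_) -> length r
           []    -> 0
    a' = replicateInner p a
    b' = replicateOuter n b
```

With this model, mmul [[1,2],[3,4]] [[5,6],[7,8]] gives [[19,22],[43,50]]. The question is which accelerate operations correspond to replicateInner, replicateOuter and foldMiddle.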
Background: I want to find out how to use accelerate+CUDA (on GTX285)
to find matrix interpretations, as in the ICFP 2010 contest, cf.
http://www.imn.htwk-leipzig.de/~waldmann/talk/10/icfp/
The matrices are small, and I want to do some kind of hill climbing.
Ideally, all the "climbing" will be done inside the GPU, and I just want
to start lots of these processes, and collect their results.
(Then re-start, from Haskell land, using the previous results somehow.)
What accelerate control structure would I use
for doing a certain number of hill climbing steps?
In fact, the hill climbing process would compute a list of individuals,
of which I only ever need the most recent/last one.
Can this be done by accelerate? (Will it overwrite
the earlier ones by some symbolic GC magic,
or does it allocate space for all?)
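In plain Haskell, the shape of the loop I have in mind is roughly the following. This is only a host-side sketch: climb, objective and neighbourStep are hypothetical names, the objective is a toy one, and nothing here says how accelerate would allocate or reuse memory.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Apply a step function a fixed number of times, keeping only the current
-- state; earlier states are never collected into a list.
climb :: Int -> (s -> s) -> s -> s
climb n step s0 = go n s0
  where
    go k s
      | k <= 0    = s
      | otherwise = let s' = step s in s' `seq` go (k - 1) s'

-- Toy objective (an assumption for illustration): maximal at x = 7.
objective :: Int -> Int
objective x = negate ((x - 7) * (x - 7))

-- One hill climbing step: move to the best of the current point
-- and its two neighbours.
neighbourStep :: Int -> Int
neighbourStep x = maximumBy (comparing objective) [x - 1, x, x + 1]
```

Here climb 20 neighbourStep 0 ends at 7, and only the last individual is ever returned.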
What about random numbers? Should I pass in (from Haskell land)
a list/array of "randoms"?
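For instance, the host could precompute one block of pseudo-random numbers per process and hand those over as input. A minimal sketch, assuming a simple linear congruential generator (weak, for illustration only); lcgStep, randomsFrom, randomBlock and blocksFor are hypothetical names, and the transfer to the GPU is not shown.

```haskell
-- One step of a linear congruential generator modulo 2^31.
lcgStep :: Integer -> Integer
lcgStep x = (1103515245 * x + 12345) `mod` 2147483648

-- Infinite stream of pseudo-random numbers following the given seed.
randomsFrom :: Integer -> [Integer]
randomsFrom seed = drop 1 (iterate lcgStep seed)

-- A block of n numbers for one hill climbing process.
randomBlock :: Int -> Integer -> [Integer]
randomBlock n seed = take n (randomsFrom seed)

-- One block per process, each process identified by its own seed.
blocksFor :: Int -> [Integer] -> [[Integer]]
blocksFor n = map (randomBlock n)
```

For example, blocksFor 100 [1 .. 64] would give 64 blocks of 100 numbers each, one per process.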
I guess the alternative is to control the hill climbing from Haskell
land. Then I would use accelerate only to compute matrix products.
Would this incur too much communication over the bus?
I totally don't see how accelerate/CUDA will distribute the work.
Is each expression (Acc ...) compiled to run in one kernel?
Then how do I run several?
Yes, that's a lot of questions. I am very willing to invest
some work here - it's not just that I want matrix interpretations;
I also plan to present this stuff (program CUDA from Haskell) in a
lecture. So it should not only work - it should also look
nice and easy ...
Best, Johannes.