Thoughts on shared memory parallelism

This is more to remind myself of what I am thinking about right now. I should be thinking about Advanced Topics in High Performance Computing, as the exam is tomorrow, but I find myself wandering onto (admittedly more advanced) non-examinable material.

So my current thoughts are around the future, say the next 3 to 5 years, which would be a nice time to do a PhD. I am currently working on some OpenCL benchmarking, and I figure that it's all very nice that you can manually schedule tasks to be performed on hardware co-processors (GPGPU), but that's like pthreads and what we really want is OpenMP! That is, think about the problem and not the mechanics.
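
To make the pthreads versus OpenMP point a bit more concrete, here is a rough analogy in Python rather than real OpenMP. The function names and the sum-of-squares workload are made up for illustration. The first version does all the plumbing by hand, pthreads style; the second just says what to compute and leaves the scheduling to a pool, which is closer in spirit to an OpenMP parallel for with a reduction. Note that CPython threads will not actually speed up CPU-bound work like this; the point is the shape of the code, not the timing.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x * x


def sum_squares_manual(data, n_threads=4):
    """'pthreads style': split the work, start and join the threads by hand."""
    partial = [0] * n_threads  # one slot per thread, so no lock is needed
    chunk = (len(data) + n_threads - 1) // n_threads  # ceiling division

    def worker(tid):
        lo = tid * chunk
        hi = min(lo + chunk, len(data))
        partial[tid] = sum(square(x) for x in data[lo:hi])

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)


def sum_squares_declarative(data, n_threads=4):
    """'OpenMP style' (by analogy only): state the loop, let the pool schedule it."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return sum(pool.map(square, data))


if __name__ == "__main__":
    data = list(range(1000))
    print(sum_squares_manual(data), sum_squares_declarative(data))
```

Both functions give the same answer; the difference is how much of the mechanics you had to write to get there.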

So why is this important? Well, current GPGPU programming has a problem, a normal HPC type problem … the interconnect. PCI Express is a point-to-point communications channel running at a snail's pace compared to the CPU and memory, so squeezing data across it is basically a bad idea for HPC … fine for games and video. Why would anyone change this? Well, power consumption would be a good reason: the latest Intel Core i5s have GPUs on the same physical chip package as the CPU, to save power and drive low-cost notebooks. It's currently separate silicon but could in theory be interconnected with a fast bus: QPI on Intel or, much better, HyperTransport! The only problem is that it would appear from Intel's documentation that the onboard GMAs are connected by PCIe, dohh! Never mind, they are GMAs, which are _not_ the best.
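
A back-of-envelope model shows why the interconnect matters so much. This is only a sketch under loud assumptions: the function names are mine, the model ignores latency and any overlap of transfer with compute, and the host and accelerator rates in the example are round illustrative numbers, not measurements. The one hard figure is that PCIe 2.0 x16 peaks at about 8 GB/s in each direction.

```python
def offload_time(n_bytes, n_flops, link_bytes_per_s, accel_flops_per_s):
    """Seconds to push the data across the link and then compute on the accelerator."""
    return n_bytes / link_bytes_per_s + n_flops / accel_flops_per_s


def host_time(n_flops, host_flops_per_s):
    """Seconds to just do the work on the host, with no transfer at all."""
    return n_flops / host_flops_per_s


def offload_pays_off(n_bytes, n_flops, link_bytes_per_s,
                     accel_flops_per_s, host_flops_per_s):
    """True when transfer plus accelerator compute beats staying on the host."""
    return (offload_time(n_bytes, n_flops, link_bytes_per_s, accel_flops_per_s)
            < host_time(n_flops, host_flops_per_s))


if __name__ == "__main__":
    link = 8e9      # PCIe 2.0 x16 theoretical peak, bytes/s, one direction
    host = 50e9     # ASSUMED host rate in flop/s (illustrative only)
    accel = 500e9   # ASSUMED accelerator rate in flop/s (illustrative only)
    n_bytes = 1e9   # 1 GB of input data

    for flops_per_byte in (1, 100):
        n_flops = n_bytes * flops_per_byte
        print(flops_per_byte, "flop/byte, offload pays off:",
              offload_pays_off(n_bytes, n_flops, link, accel, host))
```

With these made-up rates, a kernel doing only a little arithmetic per byte loses to the host because the transfer dominates, while a kernel doing a lot of arithmetic per byte wins. A faster link moves that break-even point, which is the whole argument for getting the GPU off PCIe.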

I did go off on a mild tangent thinking about doing an Open Source HW ASIC and supercomputer blade that could scale, not dissimilar to a Cray SeaStar+ design. There are Open Source HyperTransport cores, OpenRISC or SPARC processor cores and appropriate 1 or 10G Ethernet IP, though of course we are looking to Light Peak as an interconnect. Could easily get a design done and built, then get a factory in China to pump them out on demand. The rest (cabinets, cooling and data centre stuff) … er, someone else can deal with that. Just build MPI right in at the core.

So back to AMD and Fusion, that being AMD's next-gen processors. There is no fixed public plan that I currently know of (at least not one that can't be changed), but basically a new x86-64 core, 'Bulldozer', will replace the K10 core currently used in Opteron and it will go 8/12/16/24 core. They will also bring onboard GPUs based on the Radeon GPU technology, either on the same package or the same die. Moreover, it had better be connected by HyperTransport, sooner rather than later.

What about NVIDIA? Well, I just got my Fermi card (at least, I'm awaiting the courier, due today). Hot hot, but the right idea. Fermi on HyperTransport? Or cheaper to just buy out AMD?

Anyways, back to the beginning: we want a way to not think about the layer of abstraction, but just to be able to choose to run code on various accelerators without the plumbing. So I have found two papers of particular note from last year's International Workshop on OpenMP: A Proposal to Extend the OpenMP Tasking Model for Heterogeneous Architectures, and Can OpenMP be extended to deal with Hardware Accelerator? The next steps from the first paper would be a great place to start building an OpenCL driver version. Wonder if I could do a PhD in that?

Anyways, back to study :(


Current interests ... and PhD?

So last week I went to the Turing Lectures, this year given by Prof. Chris Bishop [ http://research.microsoft.com/en-us/um/people/cmbishop/ ]. Hey, next year it's Donald Knuth!!!

Prof. Bishop is a nice man; I even had dinner with him afterwards. Interesting, though, was his take on physics, maths and flying, especially abuse of Airbus A320 simulators, but it's his story, so if you meet him ask about it.

Anyways, I am currently in the mood to do a PhD, but in what????

Obviously during my MSc in High Performance Computing we have been dealing with scientific computational problems, so something along those lines ought to be appropriate. Prof. Bishop's lecture was on the third generation of AI or Computer Learning and discussed what, how and where it has been applied. He also mentioned that they have produced a research tool called Infer.NET [ http://research.microsoft.com/infernet ]. I am currently interested in GPGPU, or more specifically OpenCL-based hardware acceleration using GPU, FPGA, Cell or whatever. However, data-driven applications, models (HPC) and scalable computing platforms also appeal.

Howz about an engine that trawls the chemical database looking for matches to receptors on disease molecules etc.? Things that are massively computational, data driven and, more importantly for me, something that could be done in my preferred view of the software world, but more on that when my dissertation is complete. I'm not sure how much I can say, as I don't want the plagiarism software to strike me off before I get started, but Mono/.NET is the way forward. As a final interesting piece, consider this: Cray dropped Sisal (Streams and Iterations in a Single Assignment Language) in the mid-90s coz they stopped producing vector processors. I think it's a nice fit for GPGPUs (SISAL#?), but there may be a better way ... I did find BSGP, Bulk-Synchronous GPU Programming [ http://www.kunzhou.net/#BSGP ], a very interesting concept that might have to be re-examined.

Right anyways, real life says 'just back from digging the allotment' and the G&T needs 'refreshing' and I really need to fix the comments on this BlogEngine!!

keepin it Scientific like,

S

PS: hopefully going to the Cray Users Group in Edinburgh this year - waaaaahey [ http://www.cug.org/ ]

 


dohh

Ok, so until I work out how to get rid of the spam stuff ... probably an upgrade ... more to do :'(

 

Right, so I have been busy but am probably starting to get a breather, and I have my dissertation to do ...

The current status is: OpenMP = shared memory, and nicer than threads but a different target

MPI = distributed memory = Cray XT4 = HECToR

Guys, guys, guys, LaTeX doesn't do blogging!!!

NOBODY send me a link to a LaTeX-based blog engine, I won't even google it!!

Right, I am currently watching http://microsoftpdc.com/Sessions/P09-17

while porting the section 2 computation kernels from http://www2.epcc.ed.ac.uk/computing/research_activities/java_grande/index_1.html to C# on Mono 2.6.1 with OpenCL.NET from http://www.hoopoe-cloud.com/Solutions/OpenCL.NET. But can they easily be scaled in parallel via http://osl.iu.edu/research/mpi.net/ ?

 

oh those I drink with are going to give me grief but hey ho this is 'a log a rythmic type of jazz, yeah'


About Me

Stuart Fraser is a 20 year veteran of the 'Data Systems Department', going back to University. I might have some thoughts.

Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

© Copyright 2010