CSC/ECE 506 Spring 2010/summary
<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=windows-1252"><title>ECE633 Independent Study: Architecture of Parallel Computers</title></head><body lang=EN-US link=blue vlink=purple>
ECE633 Independent Study: Architecture of Parallel Computers<o:p></o:p>
Karishma Navalakha<o:p></o:p>
Abstract:<o:p></o:p>
There has been tremendous research and development in the field of multi-core architecture in the last decade. In such a dynamic environment, it is very difficult for textbooks to cover the latest developments, so a wiki-written textbook is an extremely handy tool for getting students acquainted with, and interested in, ongoing research. In this independent study we explored an academic learning technique in which students learn the fundamental concepts of the subject from the textbook and from the lectures delivered by Prof. Gehringer in class, then build on this foundation by gathering the latest information from varied online resources and technical papers and summarizing their findings in the form of wiki pages. Software currently under development was adopted in this course to assist the students. We tried to enhance the quality of the student-submitted wiki pages through peer review: Professor Gehringer and I constantly provided input to help students improve both the quality of their wiki pages and the quality of their reviewing. The software, developed under Professor Gehringer's guidance, has been vital in overcoming the administrative hurdles involved in assigning topics to students, maintaining updates and tracking the progress of the writing, gathering feedback through peer review, and handling resubmitted work. All of this was managed via the software in an organized fashion.

Experience with the wiki-written textbook:
The
software was first deployed in CSC/ECE 506, Architecture of Parallel Computers. This is a beginning masters-level course that is taken by all Computer Engineering masters students. It is optional for Computer Science students, but as it is one way to fulfill a core requirement, it is popular with them too. The recently adopted textbook for this course is the locally written Fundamentals of Parallel Computer Architecture: Multichip and Multicore Systems [Solihin 2009]. It did not make sense to have the students rewrite this excellent text, but the book concentrates on theory and design fundamentals, without detailed application to current parallel machines. We felt that students would benefit from learning how the principles were applied in current architectures. Furthermore, they would learn about the newest machines in this fast-changing
field.

After every chapter covered in class, two individuals or pairs of students were required to sign up to write the wiki supplement for that particular chapter. (That is, we solicited two supplements for each chapter, each of which could be authored by one or two students.) They were asked to add specific types of information that were not included in the chapter.

Initially,
students were not clear about the purpose of their wiki pages. The first pages they wrote substantially duplicated topics covered in the textbook: students were attempting to give complete coverage of the issues discussed in the chapter, when we wanted them to concentrate instead on recent developments. Upon seeing this, we established the practice of having Gehringer and Navalakha review the student work, along with three peer reviews from fellow students. A lot of review time was spent providing guidance on how to revise.

At
the beginning, we gave the students complete freedom to explore resources for the topic they had chosen to write on. This was not very successful, as the students seemingly chose to read the first few search hits, which tended to provide an overview of the topic rather than in-depth information on particular implementations. Sometimes students were not aware that the information they found was already covered in the next chapter, which they had not read yet; the first review we gave students often consisted mainly of making them aware of topics covered in later chapters, so much of the effort spent on the initial draft was wasted. After the first two sets of topics, we began to provide links to material that we wanted the students to pay attention to. Gehringer and Navalakha met weekly to discuss what to provide to students, regularly consulting other textbooks, technology news, and the Web sites of major processor manufacturers such as Intel and AMD. As the semester progressed, the quality of the initial submissions improved, and the students realized better returns for their effort.

The
quality of work seemed to improve as the semester progressed. A comparison of grades for the wiki pages revealed that the average score for the first chapter written by each student was 82.8%, while the average for the second submission was 82.7%. The quality of the wiki pages had improved, but at the same time the peer reviewers became more demanding: students were given more input for improving their work via peer review. Thus the improvement showed in the final wiki pages produced, even though it was not reflected in the grades received by the students. The initial wiki pages presented haphazardly collected data, were cluttered with diagrams and graphs, and largely restated facts given in the textbook. The later wiki pages focused on a comparative study of present-day supercomputers produced by Intel, AMD and IBM.

For
example, while writing the wiki for cache-coherence protocols, the students examined which protocol was favored by which company and why. They also discussed protocols introduced in the last two years, e.g., Intel's MESIF protocol. Such in-depth analysis made the wiki more appealing to readers. Gehringer and Navalakha provided additional reviews, which helped constantly improve the quality of the wiki pages. These reviews gave the students insight into what was expected of them, which led to an increasing focus on current developments during peer reviewing; it was observed that later rounds of reviews included guidance similar to that received from Gehringer and Navalakha. The organization of the wiki pages and the volume of relevant data collected by students improved as the semester progressed.

Electronic
peer-review systems have been widely used to review student work, but never before, to our knowledge, have they been applied to assignments consisting of multiple interrelated parts with precedence constraints. The growing interest in large collaborative projects, such as wiki textbooks, has led to a need for electronic support for the process, lest the administrative burden on
instructor and TA grow too large.

Chapter-wise learning from this independent study:
Chapter 1: <o:p></o:p>
This chapter covered the interesting topic of supercomputer evolution. The wiki pages written for this topic included a great deal of data from the literature. Students came up with interesting topics that were not covered in the textbook, such as <a href="http://pg-server.csc.ncsu.edu/mediawiki/index.php/CSC/ECE_506_Spring_2010/ch1_lm#Timeline_of_supercomputers">Timeline of supercomputers</a>, <a href="http://pg-server.csc.ncsu.edu/mediawiki/index.php/1.1#First_Supercomputer_.28_ENIAC_.29">First Supercomputer (ENIAC)</a>, <a href="http://pg-server.csc.ncsu.edu/mediawiki/index.php/1.1#Cray_History">Cray History</a>, <a href="http://pg-server.csc.ncsu.edu/mediawiki/index.php/1.1#Supercomputer_Hierarchal_Architecture">Supercomputer Hierarchical Architecture</a>, <a href="http://pg-server.csc.ncsu.edu/mediawiki/index.php/1.1#SuperComputer_Operating_System">Supercomputer Operating System</a>, <a href="http://pg-server.csc.ncsu.edu/mediawiki/index.php/1.1#Cooling_Supercomputer">Cooling Supercomputer</a> and <a href="http://pg-server.csc.ncsu.edu/mediawiki/index.php/CSC/ECE_506_Spring_2010/ch1_lm#Processor_Family">Processor Family</a>. From their research we could see the increasing dominance of Intel's processors in the consumer market, and we could conclude that Unix has been the platform for most of these supercomputers. Massively Parallel Processing (MPP) and Symmetric Multiprocessing (SMP) were the earliest widely used styles of multiprocessor architecture; they were displaced by constellation computing in the early 2000s, and the field is currently dominated by cluster computing.

References:
<a href="http://www.top500.org/">http://www.top500.org/</a><o:p></o:p>
<a
href="http://books.google.com/books?id=wx4kNh8ArH8C&pg=PA3&lpg=PA3&dq=evolution+of+supercomputers&source=bl&ots=7DVWaEYsZ4&sig=WKRWRuqtM-UfPoB-Wdka5ZWTgng&hl=en&ei=xAleS-TmDpqutgfcj_2jAg&sa=X&oi=book_result&ct=result&resnum=1&ved=0CAoQ6AEwADgK#v=onepage&q=evolution20supercomputers&f=false" title="http://books.google.com/books?id=wx4kNh8ArH8C&pg=PA3&lpg=PA3&dq=evolution+of+supercomputers&source=bl&ots=7DVWaEYsZ4&sig=WKRWRuqtM-UfPoB-Wdka5ZWTgng&hl=en&ei=xAleS-TmDpqutgfcj_2jAg&sa=X&oi=book_result&ct=result&resnum=1&ved=0CAoQ6AEwADgK#v=onepage&q=evolu">The future of supercomputing: an interim report By National Research Council (U.S.).
Committee on the Future of Supercomputing</a><o:p></o:p>Chapter 2: <o:p></o:p>
Data-parallel programming: The students provided comparisons between data parallelism and task parallelism. <a href="http://pg-server.csc.ncsu.edu/mediawiki/index.php/CSC/ECE_506_Spring_2010/ch_2_maf#References">Haveraaen (2000)</a> notes that data-parallel codes typically bear a strong resemblance to sequential codes, making them easier to read and write. Students noted that the data-parallel model may be used with either the shared-memory or the message-passing model without conflict. In their comparisons they concluded that combining the data-parallel and message-passing models reduces the amount and complexity of communication required relative to a task-parallel approach; similarly, combining the data-parallel and shared-memory models tends to simplify and reduce the amount of synchronization required. SIMD (single-instruction-multiple-data) processors are specifically designed to run data-parallel algorithms. Modern examples include CUDA processors developed by Nvidia and Cell processors developed by STI (Sony, Toshiba, and IBM).

References:
1. W. Daniel Hillis and Guy L. Steele, Jr., <a href="http://portal.acm.org/citation.cfm?id=7903" title="http://portal.acm.org/citation.cfm?id=7903">"Data parallel algorithms,"</a> Communications of the ACM, 29(12):1170-1183, December 1986.
2. Alexander C. Klaiber and Henry M. Levy, <a href="http://portal.acm.org/citation.cfm?id=192020" title="http://portal.acm.org/citation.cfm?id=192020">"A comparison of message passing and shared memory architectures for data parallel programs,"</a> in Proceedings of the 21st Annual International Symposium on Computer Architecture, April 1994, pp. 94-105.

Chapter 3:
In this wiki supplement, the three kinds of parallelism, i.e., DOALL, DOACROSS and DOPIPE, were discussed, with examples in the form of OpenMP code as presented in the textbook. In addition, the students provided further depth by discussing parallel_for, parallel_reduce, parallel_scan, pipeline, reduction, DOALL, DOACROSS and DOPIPE with respect to Intel Threading Building Blocks, and compared DOPIPE, DOACROSS and DOALL in POSIX Threads. Finally they concluded: Pthreads works for all these kinds of parallelism and can express functional parallelism easily, but it requires building specialized synchronization primitives and explicitly privatizing variables, which makes it more effort to convert a serial program into a parallel one.

OpenMP provides many performance-enhancing features, such as the atomic, barrier and flush synchronization primitives. It is very simple to use OpenMP to exploit DOALL parallelism, but the syntax for expressing functional parallelism is awkward.

Because Intel TBB relies on generic programming, it performs better with custom iteration spaces or complex reduction operations. It also provides generic parallel patterns for parallel while-loops, data-flow pipeline models, parallel sorts and prefixes, so it is the better choice in cases that go beyond loop-based parallelism.

References:
1. <a href="https://docs.google.com/viewer?a=v&pid=gmail&attid=0.1&thid=126f8a391c11262c&mt=application%2Fpdf&url=https2F2Fmail3Fui26ik26view26th26attid26disp26realattid26zw&sig=AHIEtbTeQDhK98IswmnVSfrPBMfmPLH5Nw" title="https://docs.google.com/viewer?a=v&pid=gmail&attid=0.1&thid=126f8a391c11262c&mt=application%2Fpdf&url=https2F2Fmail3Fui26ik26view26th26attid26disp26realattid%3Df_g602o">An Optimal Abstraction Model for Hardware Multithreading in Modern Processor Architectures</a>
2. <a href="http://www.threadingbuildingblocks.org/uploads/81/91/Latest20Source%20Documentation/Reference.pdf" title="http://www.threadingbuildingblocks.org/uploads/81/91/Latest20Source%20Documentation/Reference.pdf">Intel Threading Building Blocks 2.2 for Open Source Reference Manual</a>
3. <a href="https://computing.llnl.gov/tutorials/pthreads/#Joining" title="https://computing.llnl.gov/tutorials/pthreads/#Joining">POSIX Threads Programming, by Blaise Barney, Lawrence Livermore National Laboratory</a>
Chapter 6:<o:p></o:p>
Cache structures of multi-core architectures: Students added insight on this topic by discussing shared-memory multiprocessors, write policies and replacement policies. The Greedy Dual Size (GDS) and Priority Cache (PC) replacement policies were additional subtopics the students shed light on. Students also gave definitions of Intel's Trace Cache and Smart Cache techniques. The most important takeaway from this topic was the discussion of write policies used in recent multi-core architectures. For example, Intel's IA-32 and IA-64 architectures implement write combining, write collapsing, weakly ordered, uncacheable & write-no-allocate and non-temporal techniques in their caches. AMD uses cache exclusion, unlike Intel's cache inclusion. Sun's Niagara and SPARC use write-through L1 caches, with allocate-on-load and no-allocate-on-store.

References:
1. <a href="http://download.intel.com/technology/architecture/sma.pdf" title="http://download.intel.com/technology/architecture/sma.pdf">http://download.intel.com/technology/architecture/sma.pdf</a>
2. <a href="http://www.intel.com/Assets/PDF/manual/248966.pdf" title="http://www.intel.com/Assets/PDF/manual/248966.pdf">http://www.intel.com/Assets/PDF/manual/248966.pdf</a>
3. <a href="http://www.intel.com/design/intarch/papers/cache6.pdf" title="http://www.intel.com/design/intarch/papers/cache6.pdf">http://www.intel.com/design/intarch/papers/cache6.pdf</a>

Chapter 7:
Shared-memory multiprocessors run into several problems that are more pronounced than on their uniprocessor counterparts. The Solihin text used in this course goes into detail on three of these issues: cache coherence, memory consistency and synchronization. The goal of this wiki supplement was to discuss these three issues and what can be done to ensure that instructions are handled in a timely and efficient manner consistent with what the programmer desires. Memory consistency was discussed by comparing ordering on a uniprocessor with ordering on a multiprocessor; the students concluded that on a multiprocessor much more care must be taken to ensure that all loads and stores are committed to memory in a valid order. Synchronization was discussed as it applies to OpenMP and fence insertion; other methods, such as test-and-set and direct interrupts to another core, were also briefly discussed. The programmer (or compiler) is responsible for knowing which synchronization directives are available on a given architecture and using them efficiently. The students also surveyed the instructions commonly used for synchronization in popular processor architectures: for example, SPARC V8 provides a store barrier, Alpha provides a memory barrier and a write memory barrier, and Intel x86 provides lfence (load) and sfence (store).

References:
1. <a href="https://wiki.ittc.ku.edu/ittc/images/0/0f/Loghi.pdf" title="https://wiki.ittc.ku.edu/ittc/images/0/0f/Loghi.pdf">https://wiki.ittc.ku.edu/ittc/images/0/0f/Loghi.pdf</a>
2. <a href="http://portal.acm.org/citation.cfm?id=782854&dl=GUIDE&coll=GUIDE&CFID=84866326&CFTOKEN=84791790" title="http://portal.acm.org/citation.cfm?id=782854&dl=GUIDE&coll=GUIDE&CFID=84866326&CFTOKEN=84791790">http://portal.acm.org/citation.cfm?id=782854&dl=GUIDE&coll=GUIDE&CFID=84866326&CFTOKEN=84791790</a>
Chapter 8:<o:p></o:p>
Students discussed the bus-based cache coherence protocols used in real machines, classifying the protocols by the year they were introduced and the processors that use them. The MSI protocol was first used in the SGI IRIS 4D series. In the Synapse protocol, the M state is called D (Dirty), but it works the same way as MSI. MSI has a major drawback: each read-then-write sequence incurs two bus transactions, regardless of whether the cache line is stored in only one cache. The Pentium Pro, introduced in 1995, was the first Intel-architecture microprocessor to support SMP and MESI. MESI came with the drawback of using considerable time and bandwidth; MOESI was AMD's answer to this problem, and it has become one of the most popular snoop-based protocols, supported in the AMD64 architecture. The AMD dual-core Opteron can maintain cache coherence in systems of up to 8 processors using this protocol. The MESIF protocol, used in the latest Intel multi-core processors, was introduced to accommodate the point-to-point links used in the QuickPath Interconnect. The Dragon protocol is an update-based coherence protocol that does not invalidate other cached copies; it was developed at the Xerox Palo Alto Research Center (Xerox PARC) and used in the Xerox PARC Dragon multiprocessor workstation.

References:
1. <a href="http://www.zak.ict.pwr.wroc.pl/nikodem/ak_materialy/Cache20&%20MESI.pdf" title="http://www.zak.ict.pwr.wroc.pl/nikodem/ak_materialy/Cache20&%20MESI.pdf">Cache consistency with MESI on Intel processors</a>
2. <a href="http://techreport.com/articles.x/8236/2" title="http://techreport.com/articles.x/8236/2">AMD dual-core architecture</a>
3. <a href="http://ieeexplore.ieee.org.www.lib.ncsu.edu:2048/stamp/stamp.jsp?tp=&arnumber=4913" title="http://ieeexplore.ieee.org.www.lib.ncsu.edu:2048/stamp/stamp.jsp?tp=&arnumber=4913">Silicon Graphics Computer Systems</a>
4. <a href="http://portal.acm.org/citation.cfm?id=1499317&dl=GUIDE&coll=GUIDE&CFID=83027384&CFTOKEN=95680533" title="http://portal.acm.org/citation.cfm?id=1499317&dl=GUIDE&coll=GUIDE&CFID=83027384&CFTOKEN=95680533">Synapse tightly coupled multiprocessors: a new approach to solve old problems</a>
5. <a href="http://en.wikipedia.org/wiki/Dragon_protocol" title="http://en.wikipedia.org/wiki/Dragon_protocol">Dragon Protocol</a>
Chapter 9:<o:p></o:p>
Synchronization: Students classified synchronization techniques by implementation. Hardware synchronization uses locks, barriers and mutual exclusion; software synchronization examples include ticket locks and queue-based MCS locks. Mutex implementations rely on the execution of atomic instructions; common examples include test-and-set, fetch-and-increment, exchange, and compare-and-swap. Another type of lock not discussed in the text, the "hand-off" lock, was discussed in detail by the students. They also discussed reasons why a programmer should attempt to write programs in a way that avoids locks. There are APIs for parallel architectures that provide specific types of synchronization; if these APIs are used the way they were designed, performance can be maximized while overhead is minimized. Load-Locked (LL) and Store-Conditional (SC) are a pair of improved hardware primitives used for lock-free read-modify-write operations.

A detailed description of the combining-tree barrier, tournament barrier and dissemination barrier was included. One of the interesting topics discussed in this wiki supplement was the performance evaluation of different barrier implementations: the students showed that the centralized blocking barrier does not scale, as contention increases with the number of threads.

References:
1. <a href="http://www2.cs.uh.edu/~hpctools/pub/iwomp-barrier.pdf">http://www2.cs.uh.edu/~hpctools/pub/iwomp-barrier.pdf</a>
2. <a href="http://www.statemaster.com/encyclopedia/Deadlock">http://www.statemaster.com/encyclopedia/Deadlock</a>
3. <a href="http://www.ukhec.ac.uk/publications/reports/synch_java.pdf">http://www.ukhec.ac.uk/publications/reports/synch_java.pdf</a>
Chapter 10: <o:p></o:p>
References:
1. <a href="http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-95-7.pdf" title="http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-95-7.pdf">Shared Memory Consistency Models</a>
2. <a href="http://portal.acm.org/citation.cfm?id=193889&dl=GUIDE&coll=GUIDE&CFID=84028355&CFTOKEN=32262273" title="http://portal.acm.org/citation.cfm?id=193889&dl=GUIDE&coll=GUIDE&CFID=84028355&CFTOKEN=32262273">Designing Memory Consistency Models For Shared-Memory Multiprocessors</a>
3. <a href="http://cs.gmu.edu/cne/modules/dsm/green/memcohe.html" title="http://cs.gmu.edu/cne/modules/dsm/green/memcohe.html">Consistency Models</a>
Chapter 11:<o:p></o:p>
The cache coherence protocol presented in Chapter 11 of Solihin 2008 is simpler than most real directory-based protocols. This textbook supplement presents the directory-based protocols used by the DASH and Alewife multiprocessors, and concludes with an argument for why complexity might be undesirable in cache coherence protocols. The DASH multiprocessor uses a two-level coherence protocol, relying on a snoopy bus to ensure cache coherence within a cluster and a directory-based protocol to ensure coherence across clusters. The protocol uses a Remote Access Cache (RAC) at each cluster, which essentially consolidates memory blocks from remote clusters into a single cache on the local snoopy bus. When a request is issued for a block from a remote cluster that is not in the RAC, the request is denied but also forwarded to the owner, which supplies the block to the RAC; eventually, when the requestor retries, the block will be waiting in the RAC. Read and read-exclusive operations on a DASH processor were discussed in detail, along with two race conditions that arise on DASH. The first occurs when a Read from requester R is forwarded from home H to owner O, but O sends a Writeback to H before the forwarded Read arrives. The other occurs when the home node H replies with data (ReplyD) to a Read from requester R but an invalidation (Inv) arrives first. LimitLESS is the cache coherence protocol used by the Alewife multiprocessor. Unlike the DASH multiprocessor, the Alewife multiprocessor is not organized into clusters of nodes with local buses, so cache coherence throughout the system is maintained through the directory.

References:
1. Daniel Lenoski, James Laudon, Kourosh Gharachorloo, Anoop Gupta, and John Hennessy (1990). <a href="http://doi.acm.org/10.1145/325164.325132" title="http://doi.acm.org/10.1145/325164.325132">"The directory-based cache coherence protocol for the DASH multiprocessor."</a> In Proceedings of the 17th Annual International Symposium on Computer Architecture.
2. David Chaiken, John Kubiatowicz, and Anant Agarwal (1991). <a href="http://groups.csail.mit.edu/cag/papers/pdf/asplos4.pdf" title="http://groups.csail.mit.edu/cag/papers/pdf/asplos4.pdf">"LimitLESS directories: A scalable cache coherence scheme."</a> ACM SIGPLAN Notices.

Chapter 12:
Interconnection networks: Advances in multiprocessors, parallel computing, networking and parallel computer architectures demand very high performance from interconnection networks. Because of this, interconnection network structure has changed over time to meet higher bandwidth and performance requirements. Students discussed the criteria to be considered when choosing the best network: performance requirements, scalability, incremental expandability, partitionability, simplicity, distance span, physical constraints, reliability and repairability, expected workloads, and cost constraints. They provided an in-depth discussion of the classification of interconnection networks. Shared-medium networks include the token ring, token bus and backplane bus. Direct networks include mesh, torus, hypercube, tree, cube-connected cycles, and de Bruijn and star graph networks. Indirect networks include regular topologies such as the crossbar network and multistage interconnection networks, and hybrid networks such as multiple backplane buses, hierarchical networks, cluster-based networks and hypergraph topologies. They also discussed routing algorithms and the deadlock, starvation and livelock problems associated with them. These topics were covered in an extremely detailed way, and the students included a diagrammatic representation of every topology.

References:
1. <a href="http://www.top500.org/2007_overview_recent_supercomputers/sci" title="http://www.top500.org/2007_overview_recent_supercomputers/sci">http://www.top500.org/2007_overview_recent_supercomputers/sci</a>
2. <a href="http://www.cs.nmsu.edu/~pfeiffer/classes/573/notes/topology.html">http://www.cs.nmsu.edu/~pfeiffer/classes/573/notes/topology.html</a>
Conclusion:<o:p></o:p>
This independent study greatly increased my knowledge of the architecture of parallel computers. There were four students working on each chapter, who came up with two wiki pages per chapter; we collected a total of 18 wiki supplements, and the amount of data gathered was enormous. While reviewing their content I kept updating my own knowledge, and in providing the resources from which the students could collect data, I came across the latest developments in the field. Interacting with students helped me improve my communication skills, and constant discussions with Prof. Gehringer helped me understand key concepts. The idea of writing wiki supplements was selected for a KU Village presentation, and I had the opportunity to present this paper along with Prof. Gehringer.
</body> </html>