2009 | ||
---|---|---|
125 | George Michelogiannakis, James D. Balfour, William J. Dally: Elastic-buffer flow control for on-chip networks. HPCA 2009: 151-162 | |
124 | Nan Jiang, John Kim, William J. Dally: Indirect adaptive routing on large scale interconnection networks. ISCA 2009: 220-231 | |
123 | Daniel U. Becker, William J. Dally: Allocator implementations for network-on-chip routers. SC 2009 | |
122 | George Michelogiannakis, William J. Dally: Router designs for elastic buffer on-chip networks. SC 2009 | |
121 | John Kim, William J. Dally, Steve Scott, Dennis Abts: Cost-Efficient Dragonfly Topology for Large-Scale Systems. IEEE Micro 29(1): 33-40 (2009) | |
2008 | ||
120 | Abhishek Das, William J. Dally: Stream Scheduling: A Framework to Manage Bulk Operations in Memory Hierarchies. Euro-Par 2008: 337-349 | |
119 | John Kim, William J. Dally, Steve Scott, Dennis Abts: Technology-Driven, Highly-Scalable Dragonfly Topology. ISCA 2008: 77-88 | |
118 | Manman Ren, Ji Young Park, Mike Houston, Alex Aiken, William J. Dally: A tuning framework for software-managed memory hierarchies. PACT 2008: 280-291 | |
117 | Mike Houston, Ji Young Park, Manman Ren, Timothy J. Knight, Kayvon Fatahalian, Alex Aiken, William J. Dally, Pat Hanrahan: A portable runtime interface for multi-level memory hierarchies. PPOPP 2008: 143-152 | |
116 | William J. Dally, James D. Balfour, David Black-Schaffer, James Chen, R. Curtis Harting, Vishal Parikh, JongSoo Park, David Sheffield: Efficient Embedded Computing. IEEE Computer 41(7): 27-32 (2008) | |
2007 | ||
115 | JongSoo Park, Sung-Boem Park, James D. Balfour, David Black-Schaffer, Christos Kozyrakis, William J. Dally: Register pointer architecture for efficient embedded processors. DATE 2007: 600-605 | |
114 | William J. Dally: Interconnect-Centric Computing. HPCA 2007: 1 | |
113 | Jung Ho Ahn, Mattan Erez, William J. Dally: Tradeoff between data-, instruction-, and thread-level parallelism in stream processors. ICS 2007: 126-137 | |
112 | Mattan Erez, Jung Ho Ahn, Jayanth Gummaraju, Mendel Rosenblum, William J. Dally: Executing irregular scientific applications on stream architectures. ICS 2007: 93-104 | |
111 | John Kim, William J. Dally, Dennis Abts: Flattened butterfly: a cost-efficient topology for high-radix networks. ISCA 2007: 126-137 | |
110 | Shekhar Borkar, William J. Dally: Future of on-chip interconnection architectures. ISLPED 2007: 122 | |
109 | John Kim, James D. Balfour, William J. Dally: Flattened Butterfly Topology for On-Chip Networks. MICRO 2007: 172-182 | |
108 | William J. Dally: Enabling Technology for On-Chip Interconnection Networks. NOCS 2007: 3 | |
107 | Jayanth Gummaraju, Mattan Erez, Joel Coburn, Mendel Rosenblum, William J. Dally: Architectural Support for the Stream Execution Model on General-Purpose Processors. PACT 2007: 3-12 | |
106 | Abhishek Das, William J. Dally: Stream Scheduling: A Framework to Manage Bulk Operations in a Memory Hierarchy. PACT 2007: 405 | |
105 | Timothy J. Knight, Ji Young Park, Manman Ren, Mike Houston, Mattan Erez, Kayvon Fatahalian, Alex Aiken, William J. Dally, Pat Hanrahan: Compilation for explicitly managed memory hierarchies. PPOPP 2007: 226-236 | |
104 | John Kim, James D. Balfour, William J. Dally: Flattened Butterfly Topology for On-Chip Networks. Computer Architecture Letters 6(2): 37-40 (2007) | |
103 | John D. Owens, William J. Dally, Ron Ho, D. N. Jayasimha, Stephen W. Keckler, Li-Shiuan Peh: Research Challenges for On-Chip Interconnection Networks. IEEE Micro 27(5): 96-108 (2007) | |
2006 | ||
102 | William J. Dally: Computer Architecture in the Many-Core Era. ICCD 2006 | |
101 | James D. Balfour, William J. Dally: Design tradeoffs for tiled CMP on-chip networks. ICS 2006: 187-198 | |
100 | Steve Scott, Dennis Abts, John Kim, William J. Dally: The BlackWidow High-Radix Clos Network. ISCA 2006: 16-28 | |
99 | Abhishek Das, William J. Dally, Peter R. Mattson: Compiling for stream processing. PACT 2006: 33-42 | |
98 | Thomas L. Sterling, Peter M. Kogge, William J. Dally, Steve Scott, William Gropp, David E. Keyes, Peter H. Beckman: Multi-core issues - Multi-Core for HPC: breakthrough or breakdown? SC 2006: 73 | |
97 | Jung Ho Ahn, Mattan Erez, William J. Dally: Architecture - The design space of data-parallel memory systems. SC 2006: 80 | |
96 | Kayvon Fatahalian, Daniel Reiter Horn, Timothy J. Knight, Larkhoon Leem, Mike Houston, Ji Young Park, Mattan Erez, Manman Ren, Alex Aiken, William J. Dally, Pat Hanrahan: Memory - Sequoia: programming the memory hierarchy. SC 2006: 83 | |
95 | John Kim, William J. Dally, Dennis Abts: Interconnect routing and scheduling - Adaptive routing in high-radix Clos network. SC 2006: 92 | |
94 | Amit K. Gupta, William J. Dally: Topology optimization of interconnection networks. Computer Architecture Letters 5(1): 10-13 (2006) | |
93 | Jung Ho Ahn, William J. Dally: Data parallel address architecture. Computer Architecture Letters 5(1): 30-33 (2006) | |
2005 | ||
92 | Andrew Chang, William J. Dally: Explaining the gap between ASIC and custom power: a custom perspective. DAC 2005: 281-284 | |
91 | Jung Ho Ahn, Mattan Erez, William J. Dally: Scatter-Add in Data Parallel Architectures. HPCA 2005: 132-142 | |
90 | John Kim, William J. Dally, Brian Towles, Amit K. Gupta: Microarchitecture of a High-Radix Router. ISCA 2005: 420-431 | |
89 | Mattan Erez, Nuwan Jayasena, Timothy J. Knight, William J. Dally: Fault Tolerance Techniques for the Merrimac Streaming Supercomputer. SC 2005: 29 | |
88 | William J. Dally, Keith Diefendorff: Hot Chips 16: Power, Parallelism, and Memory Performance. IEEE Micro 25(2): 8-9 (2005) | |
2004 | ||
87 | Nuwan Jayasena, Mattan Erez, Jung Ho Ahn, William J. Dally: Stream Register Files with Indexed Access. HPCA 2004: 60-72 | |
86 | Jung Ho Ahn, William J. Dally, Brucek Khailany, Ujval J. Kapasi, Abhishek Das: Evaluating the Imagine Stream Architecture. ISCA 2004: 14-25 | |
85 | Mattan Erez, Jung Ho Ahn, Ankit Garg, William J. Dally, Eric Darve: Analysis and Performance Results of a Molecular Modeling Application on Merrimac. SC 2004: 42 | |
84 | Arjun Singh, William J. Dally, Amit K. Gupta, Brian Towles: Adaptive channel queue routing on k-ary n-cubes. SPAA 2004: 11-19 | |
83 | William J. Dally, Ujval J. Kapasi, Brucek Khailany, Jung Ho Ahn, Abhishek Das: Stream Processors: Programmability and Efficiency. ACM Queue 2(1): 52-62 (2004) | |
82 | Arjun Singh, William J. Dally, Brian Towles, Amit K. Gupta: Globally Adaptive Load-Balanced Routing on Tori. Computer Architecture Letters 3: (2004) | |
2003 | ||
81 | Brucek Khailany, William J. Dally, Scott Rixner, Ujval J. Kapasi, John D. Owens, Brian Towles: Exploring the VLSI Scalability of Stream Processors. HPCA 2003: 153-164 | |
80 | M.-J. Edward Lee, William J. Dally, Ramin Farjad-Rad, Hiok-Tiaq Ng, Ramesh Senthinathan, John H. Edmondson, John Poulton: CMOS High-Speed I/Os - Present and Future. ICCD 2003: 454-461 | |
79 | Arjun Singh, William J. Dally, Amit K. Gupta, Brian Towles: GOAL: A Load-Balanced Adaptive Routing Algorithm for Torus Networks. ISCA 2003: 194-205 | |
78 | William J. Dally, Francois Labonte, Abhishek Das, Pat Hanrahan, Jung Ho Ahn, Jayanth Gummaraju, Mattan Erez, Nuwan Jayasena, Ian Buck, Timothy J. Knight, Ujval J. Kapasi: Merrimac: Supercomputing with Streams. SC 2003: 35 | |
77 | Brian Towles, William J. Dally, Stephen P. Boyd: Throughput-centric routing algorithm design. SPAA 2003: 200-209 | |
76 | Ujval J. Kapasi, Scott Rixner, William J. Dally, Brucek Khailany, Jung Ho Ahn, Peter R. Mattson, John D. Owens: Programmable Stream Processors. IEEE Computer 36(8): 54-62 (2003) | |
75 | Brian Towles, William J. Dally: Guaranteed scheduling for switches with configuration overhead. IEEE/ACM Trans. Netw. 11(5): 835-847 (2003) | |
2002 | ||
74 | Amit K. Gupta, William J. Dally, Arjun Singh, Brian Towles: Scalable Opto-Electronic Network (SOENet). Hot Interconnects 2002: 71-76 | |
73 | Ujval J. Kapasi, William J. Dally, Scott Rixner, John D. Owens, Brucek Khailany: The Imagine Stream Processor. ICCD 2002: 282-288 | |
72 | Brucek Khailany, William J. Dally, Andrew Chang, Ujval J. Kapasi, Jinyung Namkoong, Brian Towles: VLSI Design and Verification of the Imagine Processor. ICCD 2002: 289-294 | |
71 | John D. Owens, Scott Rixner, Ujval J. Kapasi, Peter R. Mattson, Brian Towles, Ben Serebrin, William J. Dally: Media Processing Applications on the Imagine Stream Processor. ICCD 2002: 295-302 | |
70 | Ben Serebrin, John D. Owens, Chen H. Chen, Stephen P. Crago, Ujval J. Kapasi, Peter R. Mattson, Jinyung Namkoong, Scott Rixner, William J. Dally: A Stream Processor Development Platform. ICCD 2002: 303- | |
69 | Brian Towles, William J. Dally: Guaranteed Scheduling for Switches with Configuration Overhead. INFOCOM 2002 | |
68 | Brian Towles, William J. Dally: Worst-case traffic for oblivious routing functions. SPAA 2002: 1-8 | |
67 | Arjun Singh, William J. Dally, Brian Towles, Amit K. Gupta: Locality-preserving randomized oblivious routing on torus networks. SPAA 2002: 9-13 | |
66 | K. A. Shaw, William J. Dally: Migration in Single Chip Multiprocessors. Computer Architecture Letters 1: (2002) | |
65 | Brian Towles, William J. Dally: Worst-case Traffic for Oblivious Routing Functions. Computer Architecture Letters 1: (2002) | |
2001 | ||
64 | William J. Dally, Brian Towles: Route Packets, Not Wires: On-Chip Interconnection Networks. DAC 2001: 684-689 | |
63 | Li-Shiuan Peh, William J. Dally: A Delay Model and Speculative Architecture for Pipelined Routers. HPCA 2001: 255-266 | |
62 | P. Chiang, William J. Dally, E. Lee: Monolithic chaotic communications system. ISCAS (3) 2001: 325-328 | |
61 | Li-Shiuan Peh, William J. Dally: A Delay Model for Router Microarchitectures. IEEE Micro 21(1): 26-34 (2001) | |
60 | William J. Dally, Marc Tremblay, Allen J. Baum: Guest Editors' Introduction: Hot Chips 12. IEEE Micro 21(2): 13-15 (2001) | |
59 | Brucek Khailany, William J. Dally, Ujval J. Kapasi, Peter R. Mattson, Jinyung Namkoong, John D. Owens, Brian Towles, Andrew Chang, Scott Rixner: Imagine: Media Processing with Streams. IEEE Micro 21(2): 35-46 (2001) | |
2000 | ||
58 | Peter R. Mattson, William J. Dally, Scott Rixner, Ujval J. Kapasi, John D. Owens: Communication Scheduling. ASPLOS 2000: 82-92 | |
57 | William J. Dally, Andrew Chang: The role of custom design in ASIC Chips. DAC 2000: 643-647 | |
56 | Scott Rixner, William J. Dally, Brucek Khailany, Peter R. Mattson, Ujval J. Kapasi, John D. Owens: Register Organization for Media Processing. HPCA 2000: 375-386 | |
55 | Li-Shiuan Peh, William J. Dally: Flit-Reservation Flow Control. HPCA 2000: 73-84 | |
54 | Scott Rixner, William J. Dally, Ujval J. Kapasi, Peter R. Mattson, John D. Owens: Memory access scheduling. ISCA 2000: 128-138 | |
53 | Ken Mai, Tim Paaske, Nuwan Jayasena, Ron Ho, William J. Dally, Mark Horowitz: Smart Memories: a modular reconfigurable architecture. ISCA 2000: 161-171 | |
52 | Nicholas P. Carter, William J. Dally, Whay Sing Lee, Stephen W. Keckler, Andrew Chang: Processor Mechanisms for Software Shared Memory. ISHPC 2000: 120-133 | |
51 | Ujval J. Kapasi, William J. Dally, Scott Rixner, Peter R. Mattson, John D. Owens, Brucek Khailany: Efficient conditional operations for data-parallel architectures. MICRO 2000: 159-170 | |
1999 | ||
50 | William J. Dally, Steve Lacy: VLSI Architecture: Past, Present, and Future. ARVLSI 1999: 232-241 | |
49 | Stephen W. Keckler, Andrew Chang, Whay Sing Lee, Sandeep Chatterjee, William J. Dally: Concurrent Event Handling through Multithreading. IEEE Trans. Computers 48(9): 903-916 (1999) | |
1998 | ||
48 | William J. Dally, Linda Chao, Andrew A. Chien, Soha Hassoun, Waldemar Horwat, Jon Kaplan, Paul Song, Brian Totty, D. Scott Wills: Architecture of a Message-Driven Processor. 25 Years ISCA: Retrospectives and Reprints 1998: 337-344 | |
47 | William J. Dally, Andrew A. Chien, Stuart Fiske, Waldemar Horwat, Richard A. Lethin, Michael D. Noakes, Peter R. Nuth, Ellen Spertus, Deborah A. Wallach, D. Scott Wills, Andrew Chang, John S. Keen: Retrospective: the J-machine. 25 Years ISCA: Retrospectives and Reprints 1998: 54-58 | |
46 | Stephen W. Keckler, William J. Dally, Daniel Maskit, Nicholas P. Carter, Andrew Chang, Whay Sing Lee: Exploiting Fine-grain Thread Level Parallelism on the MIT Multi-ALU Processor. ISCA 1998: 306-317 | |
45 | Scott Rixner, William J. Dally, Ujval J. Kapasi, Brucek Khailany, Abelardo López-Lagunas, Peter R. Mattson, John D. Owens: A Bandwidth-efficient Architecture for Media Processing. MICRO 1998: 3-13 | |
44 | J. P. Grossman, William J. Dally: Point Sample Rendering. Rendering Techniques 1998: 181-192 | |
43 | Whay Sing Lee, William J. Dally, Stephen W. Keckler, Nicholas P. Carter, Andrew Chang: An Efficient, Protected Message Interface. IEEE Computer 31(11): 69-75 (1998) | |
1997 | ||
42 | John S. Keen, William J. Dally: Extended Ephemeral Logging: Log Storage Management for Applications with Long Lived Transactions. ACM Trans. Database Syst. 22(1): 1-42 (1997) | |
1995 | ||
41 | Larry R. Dennison, William J. Dally, Thucydides Xanthopoulos: Low-latency plesiochronous data retiming. ARVLSI 1995: 304-315 | |
40 | Stuart Fiske, William J. Dally: Thread Prioritization: A Thread Scheduling Mechanism for Multiple-Context Parallel Processors. HPCA 1995: 210-221 | |
39 | Peter R. Nuth, William J. Dally: The Named-State Register File: Implementation and Performance. HPCA 1995: 4-13 | |
38 | Marco Fillo, Stephen W. Keckler, William J. Dally, Nicholas P. Carter, Andrew Chang, Yevgeny Gurevich, Whay Sing Lee: The M-Machine multicomputer. MICRO 1995: 146-156 | |
37 | Ellen Spertus, William J. Dally: Evaluating the Locality Benefits of Active Messages. PPOPP 1995: 189-198 | |
1994 | ||
36 | Nicholas P. Carter, Stephen W. Keckler, William J. Dally: Hardware Support for Fast Capability-based Addressing. ASPLOS 1994: 319-327 | |
35 | John S. Keen, William J. Dally: XEL: Extended Ephemeral Logging for Log Storage Management. CIKM 1994: 312-321 | |
34 | William J. Dally, Larry R. Dennison, David Harris, Kinhong Kan, Thucydides Xanthopoulos: The Reliable Router: A Reliable and High-Performance Communication Substrate for Parallel Computers. PCRCW 1994: 241-255 | |
1993 | ||
33 | Michael D. Noakes, Deborah A. Wallach, William J. Dally: The J-Machine Multicomputer: An Architectural Evaluation. ISCA 1993: 224-235 | |
32 | Ellen Spertus, Seth Copen Goldstein, Klaus E. Schauser, Thorsten von Eicken, David E. Culler, William J. Dally: Evaluation of Mechanisms for Fine-Grained Parallel Programs in the J-Machine and the CM-5. ISCA 1993: 302-313 | |
31 | John S. Keen, William J. Dally: Performance Evaluation of Ephemeral Logging. SIGMOD Conference 1993: 187-196 | |
30 | William J. Dally, Hiromichi Aoki: Deadlock-Free Adaptive Routing in Multicomputer Networks Using Virtual Channels. IEEE Trans. Parallel Distrib. Syst. 4(4): 466-475 (1993) | |
29 | William J. Dally: A Universal Parallel Computer Architecture. New Generation Comput. 11(3): 227-249 (1993) | |
1992 | ||
28 | William J. Dally: A Universal Parallel Computer Architecture. FGCS 1992: 746-758 | |
27 | William J. Dally, Andrew A. Chien, Stuart Fiske, Greg Fyler, Waldemar Horwat, John S. Keen, Richard A. Lethin, Michael D. Noakes, Peter R. Nuth, D. Scott Wills: The Message Driven Processor: An Integrated Multicomputer Processing Element. ICCD 1992: 416-419 | |
26 | Peter R. Nuth, William J. Dally: The J-Machine Network. ICCD 1992: 420-423 | |
25 | Richard A. Lethin, William J. Dally: MDP Design Tools and Methods. ICCD 1992: 424-428 | |
24 | Stephen W. Keckler, William J. Dally: Processor Coupling: Integrating Compile Time and Runtime Scheduling for Parallelism. ISCA 1992: 202-213 | |
23 | William J. Dally: A Fast Translation Method for Paging on top of Segmentation. IEEE Trans. Computers 41(2): 247-250 (1992) | |
22 | William J. Dally: Virtual-Channel Flow Control. IEEE Trans. Parallel Distrib. Syst. 3(2): 194-205 (1992) | |
1991 | ||
21 | Peter R. Nuth, William J. Dally: A Mechanism for Efficient Context Switching. ICCD 1991: 301-304 | |
20 | Ellen Spertus, William J. Dally: Experiences Implementing Dataflow on a General-Purpose Parallel Computer. ICPP (2) 1991: 231-235 | |
19 | William J. Dally: Express Cubes: Improving the Performance of k-Ary n-Cube Interconnection Networks. IEEE Trans. Computers 40(9): 1016-1023 (1991) | |
1990 | ||
18 | William J. Dally: Virtual-Channel Flow Control. ISCA 1990: 60-68 | |
17 | Andrew A. Chien, William J. Dally: Concurrent Aggregates (CA). PPOPP 1990: 187-196 | |
16 | William J. Dally: Performance Analysis of k-Ary n-Cube Interconnection Networks. IEEE Trans. Computers 39(6): 775-785 (1990) | |
15 | Prathima Agrawal, William J. Dally: A hardware logic simulation system. IEEE Trans. on CAD of Integrated Circuits and Systems 9(1): 19-29 (1990) | |
1989 | ||
14 | William J. Dally: Micro-Optimization of Floating Point Operations. ASPLOS 1989: 283-289 | |
13 | Prathima Agrawal, R. Tutundjian, William J. Dally: Algorithms for Accuracy Enhancement in a Hardware Logic Simulator. DAC 1989: 645-648 | |
12 | William J. Dally, Andrew A. Chien, Stuart Fiske, Waldemar Horwat, John S. Keen, Michael Larivee, Richard A. Lethin, Peter R. Nuth, D. Scott Wills: The J-Machine: A Fine-Grain Concurrent Computer. IFIP Congress 1989: 1147-1153 | |
11 | William J. Dally, D. Scott Wills: Universal Mechanisms for Concurrency. PARLE (1) 1989: 19-33 | |
10 | Waldemar Horwat, Andrew A. Chien, William J. Dally: Experience with CST: Programming and Implementation. PLDI 1989: 101-109 | |
9 | William J. Dally, Andrew A. Chien: Object-oriented concurrent programming in CST. SIGPLAN Notices 24(4): 28-31 (1989) | |
1988 | ||
8 | William J. Dally: Mechanisms for Concurrent Computing. FGCS 1988: 154-156 | |
7 | Stuart Fiske, William J. Dally: The Reconfigurable Arithmetic Processor. ISCA 1988: 30-36 | |
1987 | ||
6 | Prathima Agrawal, William J. Dally, Ahmed K. Ezzat, W. C. Fischer, H. V. Jagadish, A. S. Krishnakumar: Architecture and Design of the MARS Hardware Accelerator. DAC 1987: 101-107 | |
5 | William J. Dally, Linda Chao, Andrew A. Chien, Soha Hassoun, Waldemar Horwat, Jon Kaplan, Paul Song, Brian Totty, D. Scott Wills: Architecture of a Message-Driven Processor. ISCA 1987: 189-196 | |
4 | William J. Dally, Charles L. Seitz: Deadlock-Free Message Routing in Multiprocessor Interconnection Networks. IEEE Trans. Computers 36(5): 547-553 (1987) | |
1986 | ||
3 | William J. Dally, Charles L. Seitz: The Torus Routing Chip. Distributed Computing 1(4): 187-196 (1986) | |
1985 | ||
2 | William J. Dally, James T. Kajiya: An Object Oriented Architecture. ISCA 1985: 154-161 | |
1 | William J. Dally, Randal E. Bryant: A Hardware Architecture for Switch-Level Simulation. IEEE Trans. on CAD of Integrated Circuits and Systems 4(3): 239-250 (1985) |