This publication page was created using our Exhibit tool; if you have a BibTeX file, you can make one just like it simply by copying a couple of files into your web server directory.
The Semantic Web Initiative envisions a Web wherein information is offered free of presentation, allowing more effective exchange and mixing across web sites and across web pages. But without substantial Semantic Web content, few tools will be written to consume it; without many such tools, there is little appeal to publish Semantic Web content.
To break this chicken-and-egg problem, thus enabling more flexible information access, we have created a web browser extension called Piggy Bank that lets users make use of Semantic Web content within Web content as users browse the Web. Wherever Semantic Web content is not available, Piggy Bank can invoke screenscrapers to re-structure information within web pages into Semantic Web format. Through the use of Semantic Web technologies, Piggy Bank provides direct, immediate benefits to users in their use of the existing Web. Thus, the existence of even just a few Semantic Web-enabled sites or a few scrapers already benefits users. Piggy Bank thereby offers an easy, incremental upgrade path to users without requiring a wholesale adoption of the Semantic Web's vision.
To further improve this Semantic Web experience, we have created Semantic Bank, a web server application that lets Piggy Bank users share the Semantic Web information they have collected, enabling collaborative efforts to build sophisticated Semantic Web information repositories through simple, everyday use of Piggy Bank.
", "pages":"16-27", "date":"2007", "author":["Huynh, David","Mazzochi, Stefano","Karger, David"], "pdfkb":"2400", "volume":"5", "cat":["Haystack","Semantic Web","Information Retrieval"], "journal":"Journal of Web Semantics", "key":"Karger:Piggy-Journal", "year":"2007", "pdf":"http://simile.mit.edu/papers/iswc05.pdf", "pub-type":"article", "number":"1", "origin":"http://service.simile-widgets.org/babel/preview#Piggy%20Bank%3A%20Experience%20the%20Semantic%20Web%20Inside%20your%20Web%20Browser" }, {"id":"{DoS:} Fighting Fire with Fire", "label":"{DoS:} Fighting Fire with Fire", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:3c54a05f85d3d7c568ea5d1335211e31", "modified":"no", "crossref":"4th ACM Workshop on Hot Topics in Networks (HotNets)", "abstract":"We consider DoS attacks on servers in which attackers' requests are indistinguishable from legitimate requests. Most current defenses against this class of attack rely on legitimate users in aggregate having more of some resource (CPU cycles, memory cycles, human attention, etc.) than attackers. A server so defended asks prospective clients to prove their legitimacy by spending some of this resource. We adopt this general approach but use bandwidth as the constrained resource. Specifically, we argue that when a server is attacked, it should: (1) prevent overloading by limiting the incoming rate of requests (and dropping all others) and (2) encourage its legitimate clients to fight back with aggressive retransmission. 
This approach forces all clients to spend bandwidth to receive service, and the legitimate clients, with their greater aggregate bandwidth, will receive the bulk of the service.", "date":"2005-11", "author":["Walfish, Michael","Balakrishnan, Hari","Karger, David","Shenker, Scott"], "venue":"HotNets", "month":"November", "cat":"Systems", "key":"Karger:FightFire", "year":"2005", "pdf":"http://nms.csail.mit.edu/papers/fightfire-hotnets-2005.pdf", "pub-type":"inproceedings", "booktitle":"4th ACM Workshop on Hot Topics in Networks (HotNets)", "address":"College Park, MD", "origin":"http://service.simile-widgets.org/babel/preview#%7BDoS%3A%7D%20Fighting%20Fire%20with%20Fire" }, {"id":"Optimal Route Planning under Uncertainty", "label":"Optimal Route Planning under Uncertainty", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:a525b849d287258a451072b5bc94f295", "modified":"no", "abstract":"We present new complexity results and efficient algorithms for optimal route planning in the presence of uncertainty. We employ a decision theoretic framework for defining the optimal route: for a given source S and destination T in the graph, we seek an ST-path of lowest expected cost where the edge travel times are random variables and the cost is a nonlinear function of total travel time. Although this is a natural model for route-planning on real-world road networks, results are sparse due to the analytic difficulty of finding closed form expressions for the expected cost, as well as the computational/combinatorial difficulty of efficiently finding an optimal path which minimizes the expected cost. We identify a family of appropriate cost models and travel time distributions that are closed under convolution and physically valid. We obtain hardness results for routing problems with a given start time and cost functions with a global minimum, in a variety of deterministic and stochastic settings. 
In general the global cost is not separable into edge costs, precluding classic shortest-path approaches. However, using partial minimization techniques, we exhibit an efficient solution via dynamic programming with low polynomial complexity.", "date":"2006-06", "author":["Nikolova, Evdokia","Brand, Matthew","Karger, David"], "venue":"ICAPS", "month":"June", "cat":"Theory", "key":"Karger:UncertainShortestPaths", "year":"2006", "pdf":"ICAPS0602NikolovaE.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of 2006 International Conference on Automated Planning and Scheduling (ICAPS 2006)", "origin":"http://service.simile-widgets.org/babel/preview#Optimal%20Route%20Planning%20under%20Uncertainty" }, {"id":"Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications", "label":"Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:961ed9f56091191b27637fe9fb5b14dc", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "abstract":"A fundamental problem that confronts peer-to-peer applications is to efficiently locate the node that stores a particular data item. This paper presents Chord, a distributed lookup protocol that addresses this problem. Chord provides support for just one operation: given a key, it maps the key onto a node. Data location can be easily implemented on top of Chord by associating a key with each data item, and storing the key/data item pair at the node to which the key maps. Chord adapts efficiently as nodes join and leave the system, and can answer queries even if the system is continuously changing. 
Results from theoretical analysis, simulations, and experiments show that Chord is scalable, with communication cost and the state maintained by each node scaling logarithmically with the number of Chord nodes.", "pages":"149--160", "date":"2001-08", "ps":"http://pdos.lcs.mit.edu/papers/chord:sigcomm01/chord_sigcomm.ps", "author":["Stoica, Ion","Morris, Robert","Karger, David","Kaashoek, M. Frans","Balakrishnan, Hari"], "venue":"SIGCOMM", "month":"August", "cat":["Theory","Systems","P2P","Applications of Theory"], "key":"Karger:Chord-sigcomm01", "year":"2001", "pub-type":"inproceedings", "confurl":"http://portal.acm.org/toc.cfm?id=383059&dl=ACM&dl=ACM&type=proceeding&idx=SERIES419&part=Proceedings&WantType=Proceedings", "booktitle":"Proceedings of the {ACM} {SIGCOMM} '01 Conference", "address":"San Diego, California", "origin":"http://service.simile-widgets.org/babel/preview#Chord%3A%20A%20Scalable%20Peer-to-peer%20Lookup%20Service%20for%20Internet%20Applications" }, {"id":"Collaborative Data Analytics with DataHub", "label":"Collaborative Data Analytics with DataHub", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:2f3e7388b086eea71aa688fa3e0585a5", "modified":"no", "pages":"1916--1919", "date":"2015-08", "author":["Bhardwaj, Anant","Deshpande, Amol","Elmore, Aaron J.","Karger, David","Madden, Sam","Parameswaran, Aditya","Subramanyam, Harihar","Wu, Eugene","Zhang, Rebecca"], "url":"http://dx.doi.org/10.14778/2824032.2824100", "doi":"10.14778/2824032.2824100", "issue_date":"August 2015", "acmid":"2824100", "volume":"8", "venue":"VLDB", "publisher":"VLDB Endowment", "month":"August", "cat":["Databases","Visualization"], "journal":"Proc. 
VLDB Endow.", "key":"Bhardwaj:Datahub", "numpages":"4", "year":"2015", "pub-type":"article", "issn":"2150-8097", "number":"12", "origin":"http://service.simile-widgets.org/babel/preview#Collaborative%20Data%20Analytics%20with%20DataHub" }, {"id":"Haystack: A Platform for Creating, Organizing, and Visualizing Information Using {RDF}", "label":"Haystack: A Platform for Creating, Organizing, and Visualizing Information Using {RDF}", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:2cf6c28c2f592f0c2c5fe03baf5b6831", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"Semantic Web Workshop at WWW2002", "abstract":"The Resource Description Framework (RDF) is designed to support agent communication on the Web, but it is also suitable as a framework for modeling and storing personal information. Haystack is a personalized information repository that employs RDF in this manner. This flexible semistructured data model is appealing for several reasons. First, RDF supports ontologies created by the user and tailored to the user's needs. At the same time, system ontologies can be specified and evolved to support a variety of high-level functionalities such as flexible organization schemes, semantic querying, and collaboration. In addition, we show that RDF can be used to engineer a component architecture that gives rise to a semantically rich and uniform user interface. We demonstrate that by aggregating various types of users' data together in a homogeneous representation, we create opportunities for agents to make more informed deductions in automating tasks for users. 
Finally, we discuss the implementation of an RDF information store and a programming language specifically suited for manipulating RDF.", "date":"2002-05", "author":["Huynh, David","Karger, David","Quan, Dennis"], "pdfkb":"310", "venue":"WWW", "month":"May", "cat":["Information Retrieval","Semantic Web","CHI","Haystack"], "key":"Karger:Haystack-Semweb02", "year":"2002", "pdf":"http://haystack.csail.mit.edu/papers/sww02.pdf", "pub-type":"inproceedings", "confurl":"http://www.www2002.org/", "booktitle":"Semantic Web Workshop at WWW2002", "address":"Honolulu, Hawaii", "origin":"http://service.simile-widgets.org/babel/preview#Haystack%3A%20A%20Platform%20for%20Creating%2C%20Organizing%2C%20and%20Visualizing%20Information%20Using%20%7BRDF%7D" }, {"id":"{Web} caching with consistent hashing", "label":"{Web} caching with consistent hashing", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:36c0f24b3cd9ae3214dd8fc1010e311e", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "coden":"????", "abstract":"A key performance measure for the World Wide Web is the speed with which content is served to users. As traffic on the Web increases, users are faced with increasing delays and failures in data delivery. Web caching is one of the key strategies that has been explored to improve performance. An important issue in many caching systems is how to decide what is cached where at any given time. Solutions have included multicast queries and directory schemes. In this paper, we offer a new web caching strategy based on consistent hashing. Consistent hashing provides an alternative to multicast and directory schemes, and has several other advantages in load balancing and fault tolerance. Its performance was analyzed theoretically in previous work; in this paper we describe the implementation of a consistent-hashing based system and experiments that support our thesis that it can provide performance improvements. 
", "pages":"1203--1213", "date":"1999-05", "author":["Karger, David","Sherman, Alex","Berkheimer, Andy","Bogstad, Bill","Dhanidina, Rizwan","Iwamoto, Ken","Kim, Brian","Matkins, Luke","Yerushalmi, Yoav"], "url":"http://www.elsevier.com/cas/tree/store/comnet/sub/1999/31/11-16/2181.pdf", "volume":"31", "day":"17", "month":"May", "cat":["Systems","Theory","Applications of Theory"], "journal":"Computer Networks", "key":"Karger:ImpWeb-Journal", "year":"1999", "bibdate":"Fri Sep 24 19:43:29 MDT 1999", "pub-type":"article", "issn":"1389-1286", "number":"11--16", "address":"Amsterdam, Netherlands", "origin":"http://service.simile-widgets.org/babel/preview#%7BWeb%7D%20caching%20with%20consistent%20hashing" }, {"id":"A spreadsheet-based user interface for managing plural relationships in structured data", "label":"A spreadsheet-based user interface for managing plural relationships in structured data", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:96220691c17bc63123f3de7ad5d4a438", "modified":"no", "pages":"2541--2550", "date":"2011-05", "author":["Bakke, Eirik","Karger, David","Miller, Rob"], "doi":"http://doi.acm.org/10.1145/1978942.1979313", "acmid":"1979313", "venue":"CHI", "publisher":"ACM", "month":"May", "cat":["CHI","Systems","Information Retrieval"], "series":"CHI '11", "key":"Karger:Spreadsheets", "numpages":"10", "year":"2011", "isbn":"978-1-4503-0228-9", "pdf": "http://people.csail.mit.edu/ebakke/research/related_worksheets_chi2011.pdf", "pub-type":"inproceedings", "keywords":["databases","foreign key relationships","hierarchical views","one-to-many relationships","spreadsheets"], "booktitle":"Proceedings of the 2011 annual conference on Human factors in computing systems", "location":"Vancouver, BC, Canada", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#A%20spreadsheet-based%20user%20interface%20for%20managing%20plural%20relationships%20in%20structured%20data" }, {"id":"Talking about data: 
sharing richly structured information through blogs and wikis", "label":"Talking about data: sharing richly structured information through blogs and wikis", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:b4dd5763072cc42192b0764e33f4fcb3", "modified":"no", "note":"poster", "pages":"1057--1058", "date":"2010-05", "author":["Benson, Edward","Marcus, Adam","Howahl, Fabian","Karger, David"], "url":"http://projects.csail.mit.edu/datapress", "doi":"http://doi.acm.org/10.1145/1772690.1772802", "venue":"WWW", "publisher":"ACM", "month":"May", "cat":["CHI","Visualization","Databases"], "key":"Karger:Datapress-Poster", "year":"2010", "isbn":"978-1-60558-799-8", "pub-type":["misc","poster"], "booktitle":"WWW '10: Proceedings of the 19th international conference on World wide web", "location":"Raleigh, North Carolina, USA", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Talking%20about%20data%3A%20sharing%20richly%20structured%20information%20through%20blogs%20and%20wikis" }, {"id":"A Nearly Optimal Oracle for Avoiding Failed Vertices and Edges", "label":"A Nearly Optimal Oracle for Avoiding Failed Vertices and Edges", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:98b7dc8f04703982b893c7d2c7c6a270", "modified":"no", "crossref":"STOC 2009: Proceedings of the 41st Annual ACM Symposium on Theory of Computing", "abstract":"We present an improved oracle for the distance sensitivity problem. The goal is to preprocess a directed graph $G = (V,E)$ with non-negative edge weights to answer queries of the form: what is the length of the shortest path from $x$ to $y$ that does not go through some failed vertex or edge $f$. The previous best algorithm produces an oracle of size $O(n^2)$ that has an $O(1)$ query time, and an $O(n^2\\sqrt{m})$ construction time. It was a randomized Monte Carlo algorithm that worked with high probability. 
Our oracle also has a constant query time and an $O(n^2)$ space requirement, but it has an improved construction time of $O(mn)$, and it is deterministic. Note that $O(1)$ query, $O(n^2)$ space, and $O(mn)$ construction time is also the best known bound (up to logarithmic factors) for the simpler problem of finding all pairs shortest paths in a weighted, directed graph. Thus, barring improved solutions to the all pairs shortest path problem, our oracle is optimal up to logarithmic factors. ", "pages":"101--110", "date":"2009-05", "author":["Bernstein, Aaron","Karger, David"], "bibsource":"DBLP, http://dblp.uni-trier.de", "editor":"Michael Mitzenmacher", "venue":"STOC", "publisher":"ACM", "month":"May", "cat":"Theory", "key":"Karger:OptimalDistanceOracle", "year":"2009", "isbn":"978-1-60558-506-2", "pdf":"Papers/optimalDistanceOracle.pdf", "pub-type":"inproceedings", "booktitle":"STOC 2009: Proceedings of the 41st Annual ACM Symposium on Theory of Computing", "origin":"http://service.simile-widgets.org/babel/preview#A%20Nearly%20Optimal%20Oracle%20for%20Avoiding%20Failed%20Vertices%20and%20Edges" }, {"id":"Human Powered Sorts and Joins", "label":"Human Powered Sorts and Joins", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:dff65382d1c567157bde4f11880d3b92", "modified":"no", "date":"2012-08", "author":["Marcus, Adam","Wu, Eugene","Karger, David","Madden, Samuel","Miller, Robert"], "month":"August", "cat":["CHI","Databases","Crowdsourcing"], "key":"Karger:HumanJoins-conf", "year":"2012", "pub-type":"inproceedings", "booktitle":"Proceedings of the 38th International Conference on Very Large Databases", "origin":"http://service.simile-widgets.org/babel/preview#Human%20Powered%20Sorts%20and%20Joins" }, {"id":"INS/Twine: A Scalable Peer-to-Peer Architecture for Intentional Resource Discovery", "label":"INS/Twine: A Scalable Peer-to-Peer Architecture for Intentional Resource Discovery", "type":"Publication", 
"uri":"http://service.simile-widgets.org/babel/urn:7eeecd18dac3f5977da84f098b680541", "modified":"no", "abstract":"The decreasing cost of computing technology is speeding the deployment of abundant ubiquitous computation and communication.", "date":"2002-08", "author":["Balazinska, Magdalena","Balakrishnan, Hari","Karger, David"], "venue":"Pervasive", "month":"August", "cat":["Systems","P2P"], "key":"twine:pervasive02", "year":"2002", "pdf":"http://project-iris.net/irisbib/papers/twine:pervasive02/paper.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the International Conference on Pervasive Computing (Pervasive 2002)", "address":"Zurich, Switzerland", "origin":"http://service.simile-widgets.org/babel/preview#INS%2FTwine%3A%20A%20Scalable%20Peer-to-Peer%20Architecture%20for%20Intentional%20Resource%20Discovery" }, {"id":"Inky: A Sloppy Command Line for the Web with Rich Visual Feedback", "label":"Inky: A Sloppy Command Line for the Web with Rich Visual Feedback", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:19ff85128a4bae9fc135a4ea6d657fec", "modified":"no", "pages":"131--140", "date":"2008-10", "author":["Miller, Rob","Chou, Victoria","Bernstein, Michael","Little, Greg","Van Kleek, Max","Karger, David","schraefel, mc"], "venue":"UIST", "month":"October", "cat":["CHI","Information Retrieval"], "key":"Karger:Inky", "year":"2008", "pub-type":"inproceedings", "booktitle":"21st Symposium on User Interface Software Technology (UIST)", "address":"Monterey, CA", "origin":"http://service.simile-widgets.org/babel/preview#Inky%3A%20A%20Sloppy%20Command%20Line%20for%20the%20Web%20with%20Rich%20Visual%20Feedback" }, {"id":"Approximation Schemes for Minimizing Average Weighted Completion Time with Release Dates", "label":"Approximation Schemes for Minimizing Average Weighted Completion Time with Release Dates", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:178f9cbdb7528f0607e53bdce5fb4668", "modified":"no", 
"crossref":"ba85c9b8de9b48025044c496454deb57", "place":"New York, NY", "abstract":"We consider the problem of scheduling jobs with release dates on parallel machines so as to minimize average weighted completion time. While constant factor approximation algorithms for many variants have been developed in the last few years, we present the first known polynomial time approximation schemes for several of them. Our results include PTASs for the case of identical parallel machines and a constant number of unrelated machines with and without preemption allowed. Our PTASs are also efficient. For most variants, the running time for a (1+e)-approximation on an instance with n jobs and m machines is O(nlog n) for each fixed e, and for all variants the running time's dependence on n is a fixed polynomial whose degree is independent of e and m.", "pages":"32--43", "date":"1999-10", "ps":"release.ps", "author":["Afrati, Foto","Bampis, Evripidis","Chekuri, Chandra","Karger, David","Kenyon, Claire","Khanna, Sanjeev","Milis, Ioannis","Queyranne, Maurice","Skutella, Martin","Stein, Cliff","Sviridenko, Maxim"], "venue":"FOCS", "publisher":"IEEE Computer Society Press", "month":"October", "cat":"Theory", "key":"Karger:Release", "year":"1999", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$32^{nd}$} Annual Symposium on the Foundations of Computer Science", "organization":"IEEE", "origin":"http://service.simile-widgets.org/babel/preview#Approximation%20Schemes%20for%20Minimizing%20Average%20Weighted%20Completion%20Time%20with%20Release%20Dates" }, {"id":"Basic Concepts for Managing Semi-structured Information in Haystack", "label":"Basic Concepts for Managing Semi-structured Information in Haystack", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:51a0b29370d6f2d6e68ea0d00de5f823", "modified":"no", "date":"2002", "author":["Quan, Dennis","Huynh, David F.","Sinha, Vineet","Zhurakhinskaya, Marina","Karger, David"], "venue":"Oxygen", 
"cat":["Haystack","Semantic Web"], "key":"Quan", "year":"2002", "pub-type":"inproceedings", "booktitle":"2nd Annual Student Oxygen Workshop, Gloucester, MA, USA", "origin":"http://service.simile-widgets.org/babel/preview#Basic%20Concepts%20for%20Managing%20Semi-structured%20Information%20in%20Haystack" }, {"id":"DDoS defense by offense", "label":"DDoS defense by offense", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:4863b84ea41e309969118bd104aad129", "modified":"no", "abstract":"This article presents the design, implementation, analysis, and experimental evaluation of speak-up, a defense against application-level distributed denial-of-service (DDoS), in which attackers cripple a server by sending legitimate-looking requests that consume computational resources (e.g., CPU cycles, disk). With speak-up, a victimized server encourages all clients, resources permitting, to automatically send higher volumes of traffic. We suppose that attackers are already using most of their upload bandwidth so cannot react to the encouragement. Good clients, however, have spare upload bandwidth so can react to the encouragement with drastically higher volumes of traffic. The intended outcome of this traffic inflation is that the good clients crowd out the bad ones, thereby capturing a much larger fraction of the server's resources than before. 
We experiment under various conditions and find that speak-up causes the server to spend resources on a group of clients in rough proportion to their aggregate upload bandwidths, which is the intended result.", "pages":"1--54", "date":"2010-03", "author":["Walfish, Michael","Vutukuru, Mythili","Balakrishnan, Hari","Karger, David","Shenker, Scott"], "doi":"http://doi.acm.org/10.1145/1731060.1731063", "volume":"28", "publisher":"ACM", "month":"March", "cat":["Systems","P2P"], "journal":"ACM Transactions on Computer Systems", "key":"Karger:Speakup-journal", "year":"2010", "pdf":"www.cs.utexas.edu/~mwalfish/papers/speakup-tocs10.pdf", "pub-type":"article", "issn":"0734-2071", "number":"1", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#DDoS%20defense%20by%20offense" }, {"id":"On the costs and benefits of procrastination: approximation algorithms for stochastic combinatorial optimization problems", "label":"On the costs and benefits of procrastination: approximation algorithms for stochastic combinatorial optimization problems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:9b1b857eba2e4d8933700681c2f1adc4", "modified":"no", "crossref":"Proceedings of the {$15^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "abstract":"Combinatorial optimization is often used to \"plan ahead,\" purchasing and allocating resources for demands that are not precisely known at the time of solution. This advance planning may be done because resources become very expensive to purchase or difficult to allocate at the last minute when the demands are known. In this work we study the tradeoffs involved in making some purchase/allocation decisions early to reduce cost while deferring others at greater expense to take advantage of additional, late-arriving information. 
We consider a number of combinatorial optimization problems in which the problem instance is uncertain---modeled by a probability distribution---and in which solution elements can be purchased cheaply now or at greater expense after the distribution is sampled. We show how to approximately optimize the choice of what to purchase in advance and what to defer.", "pages":"691--700", "date":"2004-01", "author":["Immorlica, Nicole","Karger, David","Minkoff, Maria","Mirrokni, Vahab S."], "venue":"SODA", "publisher":"Society for Industrial and Applied Mathematics", "month":"January", "cat":"Theory", "key":"Karger:2Stage", "year":"2004", "isbn":"0-89871-558-X", "pdf":"http://www.ece.northwestern.edu/~nickle/pubs/stochopt.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$15^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "location":"New Orleans, Louisiana", "address":"Philadelphia, PA, USA", "origin":"http://service.simile-widgets.org/babel/preview#On%20the%20costs%20and%20benefits%20of%20procrastination%3A%20approximation%20algorithms%20for%20stochastic%20combinatorial%20optimization%20problems" }, {"id":"Linear Network Codes: A Unified Framework for Source, Channel, and Network Coding", "label":"Linear Network Codes: A Unified Framework for Source, Channel, and Network Coding", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:f3353948284c3ce44b65cbe2a9384f0d", "modified":"no", "note":"Invited Paper", "abstract":"We examine the issue of separation and code design for network data transmission environments. We demonstrate that source-channel separation holds for several canonical network channel models when the whole network operates over a common finite field. Our approach uses linear codes. This simple, unifying framework allows us to re-establish with economy the optimality of linear codes for single transmitter channels and for Slepian-Wolf source coding. 
It also enables us to establish the optimality of linear codes for multiple access channels and for erasure broadcast channels. Moreover, we show that source-channel separation holds for these networks. This robustness of separation we show to be strongly predicated on the fact that noise and inputs are independent. The linearity of source, channel, and network coding blurs the delineation between these codes, and thus we explore joint linear design. Finally, we illustrate the fact that design for individual network modules may yield poor results when such modules are concatenated, demonstrating that end-to-end coding is necessary. Thus, we argue, it is the lack of decomposability into canonical network modules, rather than the lack of separation between source and channel coding, that presents major challenges for coding in networks.", "date":"2003", "author":["Effros, Michelle","M\\'{e}dard, Muriel","Ho, Tracey","Ray, S.","Karger, David","Koetter, Ralf","Hassibi, B."], "venue":"DIMACS", "cat":["Theory","Coding","Cuts and Flows"], "key":"Karger:NCoding4", "year":"2003", "pdf":"http://www.its.caltech.edu/~tho/dimacs03.pdf", "pub-type":"inproceedings", "booktitle":"DIMACS workshop on network information theory", "origin":"http://service.simile-widgets.org/babel/preview#Linear%20Network%20Codes%3A%20A%20Unified%20Framework%20for%20Source%2C%20Channel%2C%20and%20Network%20Coding" }, {"id":"First-price path auctions", "label":"First-price path auctions", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:3ac6a7d384528c1cc9ffcf78b77a0f4c", "modified":"no", "abstract":"We study first-price auction mechanisms for auctioning flow between given nodes in a graph. We assume edges are independent agents with fixed capacities and costs, and their objective is to maximize their profit. 
We characterize all {\\it strong $\\epsilon$-Nash equilibria} of a first-price auction for this problem, and show that the total payment is never significantly more than, and often less than, the well known dominant strategy Vickrey-Clarke-Groves (VCG) mechanism. We then present a randomized version of the first-price auction, for which the equilibrium condition can be relaxed to $\\epsilon$-Nash equilibrium. We next consider a model in which the amount of demand is uncertain, but its probability distribution is known to the edges. For this model, we show that a simple {\\em ex ante} first-price auction may not have any $\\epsilon$-Nash equilibria. We then present a modified auction mechanism with $2$-parameter bids, and show that it has an $\\epsilon$-Nash equilibrium.", "pages":"203--212", "date":"2005-06", "ps":"http://www.umich.edu/~rsami/papers/PAdraft.ps", "author":["Immorlica, Nicole","Karger, David","Nikolova, Evdokia","Sami, Rahul"], "doi":"http://doi.acm.org/10.1145/1064009.1064031", "venue":"EC", "publisher":"ACM Press", "month":"June", "cat":["Theory","Mechanism Design"], "key":"Karger:FirstPathAuction", "year":"2005", "isbn":"1-59593-049-3", "pub-type":"inproceedings", "booktitle":"EC '05: Proceedings of the 6th ACM conference on Electronic commerce", "location":"Vancouver, BC, Canada", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#First-price%20path%20auctions" }, {"id":"On Coding for Non-Multicast Networks", "label":"On Coding for Non-Multicast Networks", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:c2879055e29abd549484ee0003359fdd", "modified":"no", "note":"Invited paper", "abstract":"We consider the issue of coding for non-multicast networks. For multicast networks, it is known that linear operations over a field no larger than the number of receivers are sufficient to achieve all feasible connections. 
In the case of non-multicast networks, necessary and sufficient conditions are known, if we restrict ourselves to linear codes over a finite field [1]. However, no linearity sufficiency results exist for non-multicast networks. Indeed, [2] shows that linearity over a field is not sufficient in general. We present a coding theorem that provides necessary and sufficient conditions, in terms of receiver entropies, for an arbitrary set of connections to be achievable on any network. We conjecture that linearity is sufficient to satisfy the coding theorem, when linear operations are performed over vectors rather than scalars in a field. We illustrate the intuition of this conjecture with an example. This work is part of an ongoing cooperation with R. Koetter.", "date":"2003", "author":["M\\'{e}dard, Muriel","Effros, Michelle","Ho, Tracey","Karger, David"], "venue":"Allerton", "cat":["Theory","Applications of Theory","Coding","Cuts and Flows"], "key":"Karger:NonMulticast", "year":"2003", "pdf":"http://www.its.caltech.edu/~tho/allertonM03.pdf", "pub-type":"inproceedings", "booktitle":"$41^{st}$ Allerton Annual Conference on Communication, Control, and Signal Processing", "origin":"http://service.simile-widgets.org/babel/preview#On%20Coding%20for%20Non-Multicast%20Networks" }, {"id":"Fresnel: A Browser-Independent Presentation Vocabulary for RDF", "label":"Fresnel: A Browser-Independent Presentation Vocabulary for RDF", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:fd71c80d7d45f1cdc372c75133ee47c7", "modified":"no", "crossref":"$5^{th}$ International Semantic Web Conference (ISWC)", "abstract":"Semantic Web browsers and other tools aimed at displaying RDF data to end users are all concerned with the same problem: presenting content primarily intended for machine consumption in a human-readable way. 
Their solutions differ but in the end address the same two high-level issues, no matter the underlying representation paradigm: specifying (i) what information contained in RDF models should be presented (content selection) and (ii) how this information should be presented (content formatting and styling). However, each tool currently relies on its own ad hoc mechanisms and vocabulary for specifying RDF presentation knowledge, making it difficult to share and reuse such knowledge across applications. Recognizing the general need for presenting RDF content to users and wanting to promote the exchange of presentation knowledge, we designed Fresnel as a browser-independent vocabulary of core RDF display concepts. In this paper we describe Fresnel's main concepts and present several RDF browsers and visualization tools that have adopted the vocabulary so far.", "pages":"158--171", "slides":"Papers/fresnel-talk.pdf", "date":"2006-11", "author":["Pietriga, Emmanuel","Bizer, Chris","Karger, David","Lee, Ryan"], "doi":"10.1007/11926078_12", "venue":"ISWC", "month":"November", "cat":["Information Retrieval","Haystack","Semantic Web"], "key":"Karger:Fresnel-ISWC06", "year":"2006", "pdf":"Papers/fresnel.pdf", "pub-type":"inproceedings", "booktitle":"$5^{th}$ International Semantic Web Conference (ISWC)", "location":"Athens, GA", "origin":"http://service.simile-widgets.org/babel/preview#Fresnel%3A%20A%20Browser-Independent%20Presentation%20Vocabulary%20for%20RDF" }, {"id":"Web Caching with Consistent Hashing", "label":"Web Caching with Consistent Hashing", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:4a28948b78f029723e8f55ae9c5ad124", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "date":"1999-05", "author":["Karger, David","Sherman, Alex","Berkheimer, Andy","Bogstad, Bill","Dhanidina, Rizwan","Iwamoto, Ken","Kim, Brian","Matkins, Luke","Yerushalmi, Yoav"], "url":"http://www8.org/w8-papers/2a-webserver/caching/paper2.html", 
"venue":"WWW", "month":"May", "cat":["Systems","Theory","Applications of Theory"], "key":"Karger:ImpWeb-Conf", "comments":"A key performance measure for the World Wide Web is the speed with which content is served to users. As traffic on the Web increases, users are faced with increasing delays and failures in data delivery. Web caching is one of the key strategies that has been explored to improve performance. An important issue in many caching systems is how to decide what is cached where at any given time. Solutions have included multicast queries and directory schemes. In this paper, we offer a new Web caching strategy based on consistent hashing. Consistent hashing provides an alternative to multicast and directory schemes, and has several other advantages in load balancing and fault tolerance. Its performance was analyzed theoretically in previous work: in this paper we describe the implementation of a consistent-hashing-based system and experiments that support our thesis that it can provide performance improvements.", "year":"1999", "pub-type":"inproceedings", "keywords":"load balancing", "booktitle":"Proceedings of the Eighth World-Wide Web Conference", "origin":"http://service.simile-widgets.org/babel/preview#Web%20Caching%20with%20Consistent%20Hashing" }, {"id":"Building Peer-to-Peer Systems With Chord, a Distributed Lookup Service", "label":"Building Peer-to-Peer Systems With Chord, a Distributed Lookup Service", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:22badda4b5c45d40477cba667ca4f1c8", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "abstract":"We argue that the core problem facing peer-to-peer systems is locating documents in a decentralized network and propose Chord, a distributed lookup primitive. Chord provides an efficient method of locating documents while placing few constraints on the applications that use it. 
As proof that Chord's functionality is useful in the development of peer-to-peer applications, we outline the implementation of a peer-to-peer file sharing system based on Chord.", "date":"2001-05", "ps":"http://www.pdos.lcs.mit.edu/papers/chord:hotos01/hotos8.ps", "author":["Dabek, Frank","Brunskill, Emma","Kaashoek, M. Frans","Karger, David","Morris, Robert","Stoica, Ion","Balakrishnan, Hari"], "venue":"HotOS", "month":"May", "cat":["Theory","Systems","P2P","Applications of Theory"], "key":"Karger:Chord-hotos", "year":"2001", "pub-type":"inproceedings", "booktitle":"Proceedings of the 8th {W}orkshop on {H}ot {T}opics in {O}perating {S}ystems ({HotOS-VIII})", "organization":"IEEE Computer Society", "address":"Schloss Elmau, Germany", "origin":"http://service.simile-widgets.org/babel/preview#Building%20Peer-to-Peer%20Systems%20With%20Chord%2C%20a%20Distributed%20Lookup%20Service" }, {"id":"Looking up data in P2P systems", "label":"Looking up data in P2P systems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:cfe981093bc6afabc0f66a38181f1fce", "modified":"no", "abstract":"The main challenge in P2P computing is to design and implement a robust and scalable distributed system composed of inexpensive, individually unreliable computers in unrelated administrative domains. The participants in a typical P2P system might include computers at homes, schools, and businesses, and can grow to several million concurrent participants.", "date":"2003-02", "author":["Balakrishnan, Hari","Kaashoek, M. 
Frans","Karger, David","Morris, Robert","Stoica, Ion"], "month":"February", "cat":["Systems","P2P"], "journal":"Communications of the ACM", "key":"dht:cacm03", "year":"2003", "pdf":"http://project-iris.net/irisbib/papers/dht:cacm03/paper.pdf", "pub-type":"article", "origin":"http://service.simile-widgets.org/babel/preview#Looking%20up%20data%20in%20P2P%20systems" }, {"id":"9fb1b5507abcef0feec4e1102e548852", "label":"Using Linear Programming to Decode Linear Codes", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:9fb1b5507abcef0feec4e1102e548852", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "abstract":"Given a linear code and observations from a noisy channel, the decoding problem is to determine the most likely (ML) codeword. We describe a method for approximate ML decoding of an arbitrary binary linear code, based on a linear programming (LP) relaxation that is defined by a factor graph or parity check representation of the code. The resulting LP decoder, which generalizes our previous work on turbo-like codes [FK02, FWK02], has the ML certificate property: it either outputs the ML codeword with a guarantee of correctness, or acknowledges an error. We provide a precise characterization of when the LP decoder succeeds, based on the cost of pseudocodewords associated with the factor graph. We introduce the notion of the fractional distance of a code, defined with respect to a particular LP relaxation, and prove that the LP decoder will correct up to $()$ errors. 
For the BEC, we prove that the performance of LP decoding is equivalent to standard iterative decoding.", "date":"2003-03", "author":["Feldman, Jon","Karger, David","Wainwright, Martin"], "venue":"CISS", "month":"March", "cat":["Theory","Coding"], "key":"Karger:LDPC", "year":"2003", "pdf":"Papers/ciss03.pdf", "pub-type":"inproceedings", "booktitle":"37th annual Conference on Information Sciences and Systems (CISS)", "psgz":"Papers/ciss03.ps.gz", "address":"Baltimore, MD", "origin":"http://service.simile-widgets.org/babel/preview#9fb1b5507abcef0feec4e1102e548852" }, {"id":"535ac659e6b55dbd60f424f8af71b223", "label":"Piggy Bank: Experience the Semantic Web Inside your Web Browser", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:535ac659e6b55dbd60f424f8af71b223", "modified":"no", "abstract":"The Semantic Web Initiative envisions a Web wherein information is offered free of presentation, allowing more effective exchange and mixing across web sites and across web pages. But without substantial Semantic Web content, few tools will be written to consume it; without many such tools, there is little appeal to publish Semantic Web content.
To break this chicken-and-egg problem, thus enabling more flexible information access, we have created a web browser extension called Piggy Bank that lets users make use of Semantic Web content within Web content as users browse the Web. Wherever Semantic Web content is not available, Piggy Bank can invoke screenscrapers to re-structure information within web pages into Semantic Web format. Through the use of Semantic Web technologies, Piggy Bank provides direct, immediate benefits to users in their use of the existing Web. Thus, the existence of even just a few Semantic Web-enabled sites or a few scrapers already benefits users. Piggy Bank thereby offers an easy, incremental upgrade path to users without requiring a wholesale adoption of the Semantic Web's vision.
To further improve this Semantic Web experience, we have created Semantic Bank, a web server application that lets Piggy Bank users share the Semantic Web information they have collected, enabling collaborative efforts to build sophisticated Semantic Web information repositories through simple, everyday use of Piggy Bank.
", "date":"2005-11", "author":["Huynh, David","Mazzochi, Stefano","Karger, David"], "pdfkb":"2400", "venue":"ISWC", "month":"November", "cat":["Haystack","Semantic Web","Information Retrieval"], "key":"Karger:Piggy", "year":"2005", "pdf":"http://simile.mit.edu/papers/iswc05.pdf", "pub-type":"inproceedings", "booktitle":"International Semantic Web Conference (ISWC)", "origin":"http://service.simile-widgets.org/babel/preview#535ac659e6b55dbd60f424f8af71b223" }, {"id":"Tie strength in question \\& answer on social network sites", "label":"Tie strength in question \\& answer on social network sites", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:0529ed42e97231db71c63b162f17decd", "modified":"no", "pages":"1057--1066", "date":"2012", "author":["Panovich, Katrina","Miller, Rob","Karger, David"], "doi":"http://dx.doi.org/10.1145/2145204.2145361", "acmid":"2145361", "venue":"CSCW", "publisher":"ACM", "cat":["CHI","Information Retrieval"], "series":"CSCW '12", "key":"Karger:TieStrength", "numpages":"10", "year":"2012", "isbn":"978-1-4503-1086-4", "pub-type":"inproceedings", "keywords":["q\\&a","social network q\\&a","social networks","social search"], "booktitle":"Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work", "location":"Seattle, Washington, USA", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Tie%20strength%20in%20question%20%5C%26%20answer%20on%20social%20network%20sites" }, {"id":"Towards a unification & integration of PIM support", "label":"Towards a unification & integration of PIM support", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:a733cb2f6acb308e7034b48d57d53ba8", "modified":"no", "note":"Report from NSF workshop", "date":"2005", "author":["Jones, William","Karger, David","Bergman, Ofer","Franklin, Mike","Pratt, A","Bates, Marcia"], "key":"Karger:Unification", "year":"2005", "pub-type":"misc", 
"origin":"http://service.simile-widgets.org/babel/preview#Towards%20a%20unification%20%26%20integration%20of%20PIM%20support" }, {"id":"Wide-area cooperative storage with CFS", "label":"Wide-area cooperative storage with CFS", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:ca4e37b98c9308aa923ce2d71fa4c4d4", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "abstract":"The Cooperative File System (CFS) is a new peer-to-peer read-only storage system that provides provable guarantees for the efficiency, robustness, and load-balance of file storage and retrieval. CFS does this with a completely decentralized architecture that can scale to large systems. CFS servers provide a distributed hash table (DHash) for block storage. CFS clients interpret DHash blocks as a file system. DHash distributes and caches blocks at a fine granularity to achieve load balance, uses replication for robustness, and decreases latency with server selection. DHash finds blocks using the Chord location protocol, which operates in time logarithmic in the number of servers. CFS is implemented using the SFS file system toolkit and runs on Linux, OpenBSD, and FreeBSD. Experience on a globally deployed prototype shows that CFS delivers data to clients as fast as FTP. Controlled tests show that CFS is scalable: with 4,096 servers, looking up a block of data involves contacting only seven servers. The tests also demonstrate nearly perfect robustness and unimpaired performance even when as many as half the servers fail.", "date":"2001-10", "author":["Dabek, Frank","Kaashoek, M. 
Frans","Karger, David","Morris, Robert","Stoica, Ion"], "venue":"SOSP", "month":"October", "cat":["Theory","Systems","P2P","Applications of Theory"], "key":"Karger:Chord-FS", "year":"2001", "pdf":"http://www.cs.ucsd.edu/sosp01/papers/morris.pdf", "pub-type":"inproceedings", "confurl":"http://portal.acm.org/toc.cfm?id=502034&idx=SERIES372&type=proceeding&coll=ACM&dl=ACM&part=series&WantType=Proceedings", "booktitle":"Proceedings of the {ACM} 2001 Symposium on Operating System Principles", "location":"Banff, Canada", "origin":"http://service.simile-widgets.org/babel/preview#Wide-area%20cooperative%20storage%20with%20CFS" }, {"id":"Improved Distance Sensitivity Oracles via Random Sampling", "label":"Improved Distance Sensitivity Oracles via Random Sampling", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:dd9d2059f308d684b9d40ab2529be221", "modified":"no", "crossref":"Proceedings of the {$19^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "place":"San Francisco, CA", "abstract":"We present improved oracles for the distance sensitivity problem. The goal is to preprocess a graph $G = (V,E)$ with non-negative edge weights to answer queries of the form: what is the length of the shortest path from x to y that does not go through some failed vertex or edge f. There are two state-of-the-art algorithms for this problem. The first produces an oracle of size $O(n^2)$ that has an $O(1)$ query time, and an $O(mn^2)$ construction time. The second oracle has size $O(n^{2.5})$, but the construction time is only $O(mn^{1.5})$. We present two new oracles that substantially improve upon both of these results. Both oracles are constructed with randomized, Monte Carlo algorithms. For directed graphs with non-negative edge weights, we present an oracle of size $O(n^2)$, which has an $O(1)$ query time, and an $O(n^2m)$ construction time. 
For unweighted graphs, we achieve a more general construction time of $O(n^{3/2} \\cdot APSP + mn)$, where APSP is the time it takes to compute all pairs shortest paths in an arbitrary subgraph of $G$.", "pages":"34--43", "pdf":"Papers/detour.pdf", "date":"2008-01", "author":["Bernstein, Aaron","Karger, David"], "venue":"SODA", "publisher":"ACM Press", "month":"January", "cat":"Theory", "key":"Karger:Detours", "year":"2008", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$19^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "organization":"ACM-SIAM", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Improved%20Distance%20Sensitivity%20Oracles%20via%20Random%20Sampling" }, {"id":"End-User Application Development for the Semantic Web", "label":"End-User Application Development for the Semantic Web", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:b8791552b7a3b9b5702a27dc308606dc", "modified":"no", "abstract":"Although a lot of information has become readily accessible and necessary for daily work, the current infrastructure for managing information is ill-suited for information-oriented activities: information and functionality are scattered across applications and websites, making it difficult to aggregate and reuse just the right set of content and operations required for unique user tasks. We discuss a collection of tools built into the Haystack platform that address many of the shortcomings of current applications, and allow composing reusable fragments of information and associated operations and views from the Semantic Web into a task workspace tailored to the user and the task. Users can change the workspace to immediately meet changing requirements to easily include, remove or reuse information in multiple tasks simultaneously. 
The time that a user invests for the initial setup and occasional updates to the workspace is amortized over the numerous times he or she returns to the task, and all relevant information resources are co-located and ready to use.", "date":"2005-10", "author":["Bakshi, Karun","Karger, David"], "event":"Semantic Desktop Workshop", "pdfkb":"1465", "venue":"ISWC", "month":"October", "cat":["Information Retrieval","Semantic Web","CHI","Haystack"], "key":"Karger:Workspace", "year":"2005", "pdf":"Papers/semdesk-final.pdf", "pub-type":"inproceedings", "confurl":"http://iswc2005.semanticweb.org/", "booktitle":"Semantic Desktop 2005 Workshop, International Semantic Web Conference (ISWC)", "origin":"http://service.simile-widgets.org/babel/preview#End-User%20Application%20Development%20for%20the%20Semantic%20Web" }, {"id":"AtomsMasher: Personal Reactive Automation for the Web", "label":"AtomsMasher: Personal Reactive Automation for the Web", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:be3a10fa19ea6e46587790468c541cb8", "modified":"no", "note":"Poster.", "date":"2008-10", "author":["Van Kleek, Max","Andre, Paul","Perttunen, Mikko","Karger, David","Miller, Rob","schraefel, mc"], "venue":"UIST", "month":"October", "cat":["CHI","Information Retrieval"], "key":"Karger:AtomSmasherPoster", "year":"2008", "pub-type":["misc","poster"], "booktitle":"21st Symposium on User Interface Software Technology (UIST)", "address":"Monterey, CA", "origin":"http://service.simile-widgets.org/babel/preview#AtomsMasher%3A%20Personal%20Reactive%20Automation%20for%20the%20Web" }, {"id":"Sync kit: a persistent client-side database caching toolkit for data intensive websites", "label":"Sync kit: a persistent client-side database caching toolkit for data intensive websites", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:18631efa37ec837555fc5096e643c5a0", "modified":"no", "abstract":"The web has dramatically enhanced people's ability to communicate 
ideas, knowledge, and opinions. But the authoring tools that most people understand, blogs and wikis, primarily guide users toward authoring text. In this work, we show that substantial gains in expressivity and communication would accrue if people could easily share richly structured information in meaningful visualizations. We then describe several extensions we have created for blogs and wikis that enable users to publish, share, and aggregate such structured information using the same workflows they apply to text. In particular, we aim to preserve those attributes that make blogs and wikis so effective: one-click access to the information, one-click publishing of content, natural authoring interfaces, and the ability to easily copy-and-paste information and visualizations from other sources.", "pages":"121--130", "date":"2010-04", "author":["Benson, Edward","Marcus, Adam","Karger, David","Madden, Samuel"], "doi":"http://doi.acm.org/10.1145/1772690.1772704", "venue":"WWW", "publisher":"ACM", "month":"April", "cat":["Systems","Semantic Web","Databases"], "key":"Karger:SyncKit", "year":"2010", "isbn":"978-1-60558-799-8", "pdf":"http://people.csail.mit.edu/marcua/papers/synckit-www2010.pdf", "pub-type":"inproceedings", "booktitle":"WWW '10: Proceedings of the 19th international conference on World wide web", "location":"Raleigh, North Carolina, USA", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Sync%20kit%3A%20a%20persistent%20client-side%20database%20caching%20toolkit%20for%20data%20intensive%20websites" }, {"id":"Prim-{D}ijkstra Tradeoffs for Improved Performance-Driven Routing Tree Design", "label":"Prim-{D}ijkstra Tradeoffs for Improved Performance-Driven Routing Tree Design", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:1203ae4f636588bf84335950e82e9d87", "modified":"no", "abstract":"Analysis of Elmore delay in distributed RC tree structures shows the influence of both tree cost and tree 
radius on signal delay in VLSI interconnects. We give new and efficient interconnection tree constructions that smoothly combine the minimum cost and the minimum radius objectives, by combining respectively optimal algorithms due to Prim and Dijkstra. Previous \"shallow-light\" techniques [2, 3, 8, 13] are both less direct and less effective: in practice, our methods achieve uniformly superior cost-radius tradeoffs. Detailed timing simulations for a range of IC and MCM interconnect technologies show that our wirelength savings yield reduced signal delays when compared to shallow-light or standard minimum spanning tree and Steiner tree routing.", "pages":"890--895", "date":"1995-07", "author":["Alpert, C. J.","Hu, T. C.","Huang, J. H.","Kahng, A. B.","Karger, David"], "volume":"14", "month":"July", "cat":"Applications of Theory", "journal":"IEEE Transactions on Computer Aided Design", "key":"Karger:VLSI", "year":"1995", "pub-type":"article", "number":"7", "origin":"http://service.simile-widgets.org/babel/preview#Prim-%7BD%7Dijkstra%20Tradeoffs%20for%20Improved%20Performance-Driven%20Routing%20Tree%20Design" }, {"id":"A Polynomial-Time Approximation Scheme for Weighted Planar Graph TSP", "label":"A Polynomial-Time Approximation Scheme for Weighted Planar Graph TSP", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:ea11ba8ebdfb284535a6c48aa0e2955c", "modified":"no", "crossref":"Proceedings of the {$9^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "place":"San Francisco, CA", "abstract":"Given a planar graph on $n$ nodes with costs (weights) on its edges, define the distance between nodes $i$ and $j$ as the length of the shortest path between $i$ and~$j$. Consider this as an instance of {\\em metric\\/} TSP. For any $\\eps>0$, our algorithm finds a salesman tour of total cost at most $(1+\\eps)$ times optimal in time $n^{O(1/\\eps^2)}$. We also present a quasi-polynomial time algorithm for the Steiner version of this problem. 
", "pages":"33--41", "date":"1998-01", "author":["Arora, Sanjeev","Grigni, Michelangelo","Karger, David","Klein, Philip","Woloszyn, Andrzej"], "editor":"Howard Karloff", "venue":"SODA", "month":"January", "cat":"Theory", "key":"Karger:PlanarTSP", "year":"1998", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$9^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "psgz":"Papers/qtsp.ps.gz", "organization":"ACM-SIAM", "origin":"http://service.simile-widgets.org/babel/preview#A%20Polynomial-Time%20Approximation%20Scheme%20for%20Weighted%20Planar%20Graph%20TSP" }, {"id":"Learning Classes Correlated to a Hierarchy", "label":"Learning Classes Correlated to a Hierarchy", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:e46ce355406da59790c24c38359e63f2", "modified":"no", "institution":"MIT AI Lab", "abstract":"Trees are a common way of organizing large amounts of information by placing items with similar characteristics near one another in the tree. We introduce a classification problem where a given tree structure gives us information on the best way to label nearby elements. We suggest there are many practical problems that fall under this domain. We propose a way to map the classification problem onto a standard Bayesian inference problem. We also give a fast, specialized inference algorithm that incrementally updates relevant probabilities. 
We apply this algorithm to web-classification problems and show that our algorithm empirically works well.", "date":"2001", "author":["Shih, Lawrence","Karger, David"], "cat":["Information Retrieval","Machine Learning"], "key":"shih2001", "year":"2001", "pub-type":"techreport", "origin":"http://service.simile-widgets.org/babel/preview#Learning%20Classes%20Correlated%20to%20a%20Hierarchy" }, {"id":"Scaling all-pairs overlay routing", "label":"Scaling all-pairs overlay routing", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:288cccd5f296b55520330f0ad1046e36", "modified":"no", "abstract":"This paper presents and experimentally evaluates a new algorithm for efficient one-hop link-state routing in full-mesh networks. Prior techniques for this setting scale poorly, as each node incurs quadratic ($n^2$) communication overhead to broadcast its link state to all other nodes. In contrast, in our algorithm each node exchanges routing state with only a small subset of overlay nodes determined by using a quorum system. Using a two-round protocol, each node can find an optimal one-hop path to any other node using only $n^{1.5}$ per-node communication. Our algorithm can also be used to find the optimal shortest path of arbitrary length using only $n^{1.5} \\log n$ per-node communication. The algorithm is designed to be resilient to both node and link failures. We apply this algorithm to a Resilient Overlay Network (RON) system, and evaluate the results using a large-scale, globally distributed set of Internet hosts. 
The reduced communication overhead from using our improved full-mesh algorithm allows the creation of all-pairs routing overlays that scale to hundreds of nodes, without reducing the system's ability to rapidly find optimal routes.", "pages":"145--156", "date":"2009-12", "author":["Sontag, David","Zhang, Yang","Phanishayee, Amar","Andersen, David G.","Karger, David"], "doi":"http://doi.acm.org/10.1145/1658939.1658956", "publisher":"ACM", "month":"December", "cat":["Applications of Theory","P2P"], "key":"Karger:ScalingOverlay", "year":"2009", "isbn":"978-1-60558-636-6", "pub-type":"inproceedings", "booktitle":"CoNEXT '09: Proceedings of the 5th international conference on Emerging networking experiments and technologies", "location":"Rome, Italy", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Scaling%20all-pairs%20overlay%20routing" }, {"id":"Wrapper Induction for End-User Semantic Content Development", "label":"Wrapper Induction for End-User Semantic Content Development", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:05a43ea53abbd759cd835184d970ecf4", "modified":"no", "abstract":"The transition from existing World Wide Web content to the Semantic Web relies on the labeling and classification of existing information before it is useful to end-users and their agents. This paper presents a wrapper induction system designed to allow end-users to create, modify, and utilize semantic patterns on unlabeled World Wide Web documents. 
These patterns allow users to overlay documents with RDF classes and properties, and then to interact with this labeled content within a larger Semantic Web application, such as Haystack.", "date":"2004-05", "author":["Hogue, Andrew","Karger, David"], "venue":"WWW", "month":"May", "cat":["Haystack","Information Retrieval","Semantic Web","Machine Learning"], "key":"Hogue:Wrapper", "year":"2004", "pub-type":"inproceedings", "booktitle":"Interaction and Design for the Semantic Web Workshop at the $13^{th}$ annual World Wide Web Conference", "address":"New York, NY", "origin":"http://service.simile-widgets.org/babel/preview#Wrapper%20Induction%20for%20End-User%20Semantic%20Content%20Development" }, {"id":"Haystack: Per-User Information Environments.", "label":"Haystack: Per-User Information Environments.", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:4ecdcb5b63153e42aa9936431cd36913", "modified":"no", "date":"1997", "author":["Karger, David","Stein, Lynn"], "cat":["Information Retrieval","Haystack"], "key":"Karger97", "year":"1997", "pub-type":"unpublished", "origin":"http://service.simile-widgets.org/babel/preview#Haystack%3A%20Per-User%20Information%20Environments." }, {"id":"Infranet: Circumventing Web Censorship and Surveillance", "label":"Infranet: Circumventing Web Censorship and Surveillance", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:252fb10982f3d18f5c401d7cc24a75da", "modified":"no", "note":"Best Student Paper Award", "studentwork":"\\asteriskit{\\labelwidth} ", "abstract":"An increasing number of countries and companies routinely block or monitor access to parts of the Internet. To counteract these measures, we propose Infranet, a system that enables clients to surreptitiously retrieve sensitive content via cooperating Web servers distributed across the global Internet. These Infranet servers provide clients access to censored sites while continuing to host normal uncensored content. 
Infranet uses a tunnel protocol that provides a covert communication channel between its clients and servers, modulated over standard HTTP transactions that resemble innocuous Web browsing. In the upstream direction, Infranet clients send covert messages to Infranet servers by associating meaning to the sequence of HTTP requests being made. In the downstream direction, Infranet servers return content by hiding censored data in uncensored images using steganographic techniques. We describe the design, a prototype implementation, security properties, and performance of Infranet. Our security analysis shows that Infranet can successfully circumvent several sophisticated censoring techniques.", "pages":"247--262", "date":"2002-08", "ps":"http://wind.lcs.mit.edu/papers/usenixsec2002.ps", "author":["Feamster, Nick","Balazinska, Magdalena","Harfst, Greg","Balakrishnan, Hari","Karger, David"], "venue":"USENIX Security", "month":"August", "cat":["Systems","Theory","Applications of Theory"], "key":"Karger:Infranet", "year":"2002", "pdf":"http://wind.lcs.mit.edu/papers/usenixsec2002.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the $11^{th}$ USENIX Security Symposium", "psgz":"http://wind.lcs.mit.edu/papers/usenixsec2002.ps.gz", "address":"San Francisco, CA", "origin":"http://service.simile-widgets.org/babel/preview#Infranet%3A%20Circumventing%20Web%20Censorship%20and%20Surveillance" }, {"id":"Thresher: Automating the Unwrapping of Semantic Content from the World Wide Web", "label":"Thresher: Automating the Unwrapping of Semantic Content from the World Wide Web", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:8eac4b8970bd8c9794b0c306b7d54563", "modified":"no", "crossref":"Proceedings of the $14^{th}$ International World Wide Web Conference (WWW)", "abstract":"We describe Thresher, a system that lets non-technical users teach their browsers how to extract semantic web content from HTML documents on the World Wide Web. 
Users specify examples of semantic content by highlighting them in a web browser and describing their meaning. We then use the tree edit distance between the DOM subtrees of these examples to create a general pattern, or wrapper, for the content, and allow the user to bind RDF classes and predicates to the nodes of these wrappers. By overlaying matches to these patterns on standard documents inside the Haystack semantic web browser, we enable a rich semantic interaction with existing web pages, \"unwrapping\" semantic data buried in the pages' HTML. By allowing end-users to create, modify, and utilize their own patterns, we hope to speed adoption and use of the Semantic Web and its applications.", "pages":"86--95", "date":"2005-05", "author":["Hogue, Andrew","Karger, David"], "venue":"WWW", "month":"May", "cat":["Haystack","Information Retrieval","Semantic Web","Machine Learning"], "key":"Hogue:Thresher", "year":"2005", "pdf":"http://haystack.lcs.mit.edu/papers/www2005-thresher.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the $14^{th}$ International World Wide Web Conference (WWW)", "origin":"http://service.simile-widgets.org/babel/preview#Thresher%3A%20Automating%20the%20Unwrapping%20of%20Semantic%20Content%20from%20the%20World%20Wide%20Web" }, {"id":"Randomized Decoding for Selection-and-Ordering Problems", "label":"Randomized Decoding for Selection-and-Ordering Problems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:f810ce2d0952605ae633461cef10b724", "modified":"no", "abstract":"The task of selecting and ordering information appears in multiple contexts in text generation and summarization. For instance, methods for title generation construct a headline by selecting and ordering words from the input text. In this paper, we investigate decoding methods that simultaneously optimize selection and ordering preferences. We formalize decoding as a task of finding an acyclic path in a directed weighted graph. 
Since the problem is NP-hard, finding an exact solution is challenging. We describe a novel decoding method based on a randomized color-coding algorithm. We prove bounds on the number of color-coding iterations necessary to guarantee any desired likelihood of finding the correct solution. Our experiments show that the randomized decoder is an appealing alternative to a range of decoding algorithms for selection-and-ordering problems, including beam search and Integer Linear Programming.", "pages":"444--451", "date":"2007-04", "author":["Deshpande, Pawan","Barzilay, Regina","Karger, David"], "venue":"NAACL", "publisher":"Association for Computational Linguistics", "month":"April", "cat":"Applications of Theory", "key":"deshpande-barzilay-karger:2007:main", "year":"2007", "pdf":"ordering.pdf", "pub-type":"inproceedings", "booktitle":"Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference", "address":"Rochester, New York", "origin":"http://service.simile-widgets.org/babel/preview#Randomized%20Decoding%20for%20Selection-and-Ordering%20Problems" }, {"id":"Eyebrowse: Selective and Public Web Activity Sharing (Demo)", "label":"Eyebrowse: Selective and Public Web Activity Sharing (Demo)", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:6592da8b8b69c4164199178f75403b94", "modified":"no", "pages":"122--125", "date":"2016", "author":["Zhang, Amy X.","Karger, David","Blum, Joshua"], "url":"http://doi.acm.org/10.1145/2818052.2874341", "doi":"10.1145/2818052.2874341", "acmid":"2874341", "publisher":"ACM", "series":"CSCW '16 Companion", "key":"Karger:Eyebrowse-Demo", "numpages":"4", "year":"2016", "isbn":"978-1-4503-3950-6", "pub-type":"misc", "keywords":["activity traces","self-presentation","social media","web analytics","web browsing","web tracking"], "booktitle":"Proceedings of the 19th ACM Conference on Computer Supported Cooperative Work and 
Social Computing Companion", "location":"San Francisco, California, USA", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Eyebrowse%3A%20Selective%20and%20Public%20Web%20Activity%20Sharing" }, {"id":"Faster information dissemination in dynamic networks via network coding", "label":"Faster information dissemination in dynamic networks via network coding", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:b7a8de24efba7bd669a6261616ebfa16", "modified":"no", "pages":"381--390", "date":"2011-06", "author":["Haeupler, Bernhard","Karger, David"], "doi":"http://doi.acm.org/10.1145/1993806.1993885", "acmid":"1993885", "venue":"PODC", "publisher":"ACM", "month":"June", "cat":"Theory", "series":"PODC '11", "key":"Karger:Gossip", "numpages":"10", "year":"2011", "isbn":"978-1-4503-0719-2", "pdf":"http://arxiv.org/abs/1104.2527", "pub-type":"inproceedings", "keywords":["dynamic networks","gossip","multicast","network coding"], "booktitle":"Proceedings of the 30th annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing", "location":"San Jose, California, USA", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Faster%20information%20dissemination%20in%20dynamic%20networks%20via%20network%20coding" }, {"id":"Incremental exploratory visualization of relationships in large codebases for program comprehension", "label":"Incremental exploratory visualization of relationships in large codebases for program comprehension", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:252f9ce61c6b503f15e87835d993520f", "modified":"no", "crossref":"OOPSLA '05: Companion to the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications", "abstract":" As software systems grow in size and use more third-party libraries and frameworks, the need for developers to understand unfamiliar large codebases is rapidly 
increasing. In this poster, we present a tool, Relo, that supports users' understanding by allowing interactive exploration of code. As the developer explores relationships found in the code, Relo builds and automatically manages a visualization mirroring the developer's mental model, allowing them to group viewed artifacts or use the viewed items to ask the system for further exploration suggestions.", "pages":"116--117", "date":"2005-10", "author":["Sinha, Vineet","Miller, Robert C.","Karger, David"], "doi":"http://doi.acm.org/10.1145/1094855.1094891", "venue":"OOPSLA", "publisher":"ACM Press", "month":"October", "cat":"CHI", "key":"Karger:Relo", "year":"2005", "isbn":"1-59593-193-7", "pdf":"http://relo.csail.mit.edu/documentation/relo-etx05.pdf", "pub-type":"misc", "booktitle":"OOPSLA'05 Eclipse Technology eXchange (ETX) Workshop", "location":"San Diego, CA, USA", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Incremental%20exploratory%20visualization%20of%20relationships%20in%20large%20codebases%20for%20program%20comprehension" }, {"id":"Finders/keepers: a longitudinal study of people managing information scraps in a micro-note tool", "label":"Finders/keepers: a longitudinal study of people managing information scraps in a micro-note tool", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:bde940788f65f37f267666b790a0e9dc", "modified":"no", "pages":"2907--2916", "date":"2011-05", "author":["Van Kleek, Max G.","Styke, Wolfe","schraefel, m. 
c.","Karger, David"], "doi":"http://doi.acm.org/10.1145/1978942.1979374", "acmid":"1979374", "venue":"CHI", "publisher":"ACM", "month":"May", "cat":["CHI","Systems","Information Retrieval"], "series":"CHI '11", "key":"Karger:Listit-CHI2011", "numpages":"10", "year":"2011", "isbn":"978-1-4503-0228-9", "pdf":"http://people.csail.mit.edu/emax/papers/chi2011-finders-keepers.pdf", "pub-type":"inproceedings", "keywords":["information scraps","longitudinal study","note taking","personal information management","personal organization"], "booktitle":"Proceedings of the 2011 annual conference on Human factors in computing systems", "location":"Vancouver, BC, Canada", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Finders%2Fkeepers%3A%20a%20longitudinal%20study%20of%20people%20managing%20information%20scraps%20in%20a%20micro-note%20tool" }, {"id":"Human-powered sorts and joins", "label":"Human-powered sorts and joins", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:b721c664e43270799494f2277c936e9b", "modified":"no", "issue":"1", "pages":"13--24", "date":"2011-09", "author":["Marcus, Adam","Wu, Eugene","Karger, David","Madden, Samuel","Miller, Robert"], "url":"http://dl.acm.org/citation.cfm?id=2047485.2047487", "issue_date":"September 2011", "acmid":"2047487", "volume":"5", "publisher":"VLDB Endowment", "month":"September", "cat":["CHI","Databases","Machine Learning","Systems"], "journal":"Proc. 
VLDB Endow.", "key":"Karger:HumanJoin", "numpages":"12", "year":"2011", "pdf":"http://people.csail.mit.edu/marcua/papers/qurk-vldb2012.pdf", "pub-type":"article", "issn":"2150-8097", "origin":"http://service.simile-widgets.org/babel/preview#Human-powered%20sorts%20and%20joins" }, {"id":"Note to Self: Examining Personal Information-Keeping in a Lightweight Note-Taking Tool", "label":"Note to Self: Examining Personal Information-Keeping in a Lightweight Note-Taking Tool", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:12b9499c8b1d830215eda19532600813", "modified":"no", "note":"Best note nominee", "abstract":"This paper describes a longitudinal field experiment in personal note-taking that examines how people capture and use information in short textual notes. Study participants used our tool, a simple browser-based textual note-taking utility, to capture personal information over the course of ten days. We examined the information they kept in notes using the tool, how this information was expressed, and aspects of note creation, editing, deletion, and search. We found that notes were recorded extremely quickly and tersely, combined information of multiple types, and were rarely revised or deleted. 
The results of the study demonstrate the need for a tool such as ours to support the rapid capture and retrieval of short notes-to-self, and afford insights into how users' actual note-keeping tendencies could be used to better support their needs in future PIM tools.", "pages":"1477-1480", "date":"2009-04", "author":["Van Kleek, Max","Bernstein, Michael","Vargas, Greg","Panovich, Katrina","Karger, David","schraefel, mc"], "doi":"http://doi.acm.org/10.1145/1518701.1518924", "conference":"CHI", "venue":"CHI", "month":"April", "cat":["CHI","Information Retrieval"], "key":"Karger:Listit", "year":"2009", "pdf":"http://people.csail.mit.edu/msbernst/papers/note1546-vankleek.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the ACM CHI Conference on Human Factors in Computing Systems", "organization":"ACM", "address":"Boston, MA, USA", "origin":"http://service.simile-widgets.org/babel/preview#Note%20to%20Self%3A%20Examining%20Personal%20Information-Keeping%20in%20a%20Lightweight%20Note-Taking%20Tool" }, {"id":"Standards Opportunities around Data-Bearing Web Pages", "label":"Standards Opportunities around Data-Bearing Web Pages", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:d07e85f502e1d1d9e577ff12f499cc30", "modified":"no", "date":"2012-12", "author":"Karger, David R.", "month":"December", "cat":["CHI","Semantic Web"], "key":"Karger:HCIR-standards", "year":"2012", "pub-type":"inproceedings", "booktitle":"HCIR 2012: the Sixth International Symposium on Human-Computer Interaction and Information Retrieval", "origin":"http://service.simile-widgets.org/babel/preview#Standards%20Opportunities%20around%20Data-Bearing%20Web%20Pages" }, {"id":"Enhancing directed content sharing on the web", "label":"Enhancing directed content sharing on the web", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:333f43f7c3b64d976c6e8774ff4074f7", "modified":"no", "abstract":"", "pages":"971--980", "date":"2010-04", 
"author":["Bernstein, Michael S.","Marcus, Adam","Karger, David R.","Miller, Robert C."], "url":"http://feedme.csail.mit.edu/", "doi":"http://doi.acm.org/10.1145/1753326.1753470", "publisher":"ACM", "month":"April", "cat":["CHI","Machine Learning","Information Retrieval"], "key":"Karger:Feedme", "year":"2010", "isbn":"978-1-60558-929-9", "pdf":"http://people.csail.mit.edu/marcua/papers/feedme-chi2010.pdf", "pub-type":"inproceedings", "booktitle":"CHI '10: Proceedings of the 28th international conference on Human factors in computing systems", "location":"Atlanta, Georgia, USA", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Enhancing%20directed%20content%20sharing%20on%20the%20web" }, {"id":"Improved approximation for graph coloring", "label":"Improved approximation for graph coloring", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:a04fd6f5ae105be4c8e8bcb23d8b2062", "modified":"no", "abstract":"We show how the results of Karger, Motwani, and Sudan and Blum can be combined in a natural manner to yield an $O(n^{3/14})$ coloring of any n node 3-colorable graph. 
This improves on the previous best bound of $O(n^{1/4})$ colors.", "pages":"49--53", "date":"1997-01", "ps":"http://people.csail.mit.edu/karger/Papers/n314color.ps", "author":["Blum, Avrim","Karger, David R."], "volume":"61", "month":"January", "cat":"Theory", "journal":"Information Processing Letters", "key":"Karger:Coloring2", "year":"1997", "pub-type":"article", "number":"1", "origin":"http://service.simile-widgets.org/babel/preview#Improved%20approximation%20for%20graph%20coloring" }, {"id":"Attendee-Sourcing: Exploring The Design Space of Community-Informed Conference Scheduling", "label":"Attendee-Sourcing: Exploring The Design Space of Community-Informed Conference Scheduling", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:10b9e0a0df792b39f3572e95435d4012", "modified":"no", "crossref":"DBLP:conf/hcomp/2014", "date":"2014-11", "author":["Bhardwaj, Anant P.","Kim, Juho","Dow, Steven","Karger, David R.","Madden, Sam","Miller, Rob","Zhang, Haoqi"], "url":"http://www.aaai.org/ocs/index.php/HCOMP/HCOMP14/paper/view/8974", "biburl":"http://dblp.uni-trier.de/rec/bib/conf/hcomp/BhardwajKDKMMZ14", "bibsource":"dblp computer science bibliography, http://dblp.org", "venue":"HCOMP", "month":"November", "key":"Karger:Cobi", "year":"2014", "pdf":"http://people.csail.mit.edu/anantb/files/research/confer/confer-hcomp-14.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the Second {AAAI} Conference on Human Computation and Crowdsourcing, {HCOMP} 2014", "origin":"http://service.simile-widgets.org/babel/preview#Attendee-Sourcing%3A%20Exploring%20The%20Design%20Space%20of%20Community-Informed%20Conference%20Scheduling" }, {"id":"Minimum Cuts in Near-Linear Time", "label":"Minimum Cuts in Near-Linear Time", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:e572a43ebb5438b451d825d440ff32e6", "modified":"no", "note":"Journal version appears in Journal of the ACM 47(1)", "crossref":"Proceedings of the {$28^{th}$} {ACM} 
Symposium on Theory of Computing", "place":"Philadelphia, PA", "abstract":" We significantly improve known time bounds for solving the minimum cut problem on undirected graphs. We use a \"semiduality\" between minimum cuts and maximum spanning tree packings combined with our previously developed random sampling techniques. We give a randomized (Monte Carlo) algorithm that finds a minimum cut in an m-edge, n-vertex graph with high probability in $O(m \\log^3 n)$ time. We also give a simpler randomized algorithm that finds all minimum cuts with high probability in $O(m \\log^3 n)$ time. This variant has an optimal RNC parallelization. Both variants improve on the previous best time bound of $O(n^2 \\log^3 n)$. Other applications of the tree-packing approach are new, nearly tight bounds on the number of near-minimum cuts a graph may have and a new data structure for representing them in a space-efficient manner. ", "pages":"56--63", "date":"1996-05", "ps":"http://people.csail.mit.edu/karger/Papers/lincut.ps", "author":"Karger, David R.", "editor":"Gary Miller", "venue":"STOC", "publisher":"ACM Press", "month":"May", "cat":["Theory","Cuts and Flows"], "key":"Karger:Lincut-Conf", "year":"1996", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$28^{th}$} {ACM} Symposium on Theory of Computing", "organization":"ACM", "origin":"http://service.simile-widgets.org/babel/preview#Minimum%20Cuts%20in%20Near-Linear%20Time" }, {"id":"Automatic Layout of Structured Hierarchical Reports", "label":"Automatic Layout of Structured Hierarchical Reports", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:85e957b49578b4c2c9e00833876e6f1d", "modified":"no", "pages":"2586--2595", "date":"2013-10", "author":["Bakke, Eirik","Karger, David R.","Miller, Robert C."], "volume":"19", "venue":"InfoViz", "publisher":"IEEE", "date_0":"2013-10", "month":"October", "cat":["CHI","Databases","Visualization"], "journal":"InfoViz 2013/IEEE Transactions on Visualization and Computer 
Graphics", "key":"Karger:HierarchicalLayout", "year":"2013", "pub-type":"article", "number":"12", "origin":"http://service.simile-widgets.org/babel/preview#Automatic%20Layout%20of%20Structured%20Hierarchical%20Reports" }, {"id":"The Benefits of Network Coding over Routing in a Randomized Setting", "label":"The Benefits of Network Coding over Routing in a Randomized Setting", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:14647e68730515ecbe404c2279d0f0b8", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"IEEE International Symposium on Information Theory", "abstract":"We present a novel randomized network coding approach for robust, distributed transmission and compression of information in networks, and demonstrate its advantages over routing-based approaches.", "date":"2003-06", "author":["Ho, Tracey","Koetter, Ralf","M\\'{e}dard, Muriel","Karger, David R.","Effros, Michelle"], "venue":"ISIT", "month":"June", "cat":["Theory","Applications of Theory","Coding","Cuts and Flows"], "key":"Karger:NetworkCoding2", "year":"2003", "pdf":"http://www.its.caltech.edu/~tho/isit03-1.pdf", "pub-type":"inproceedings", "confurl":"http://www.isit2003.org/", "booktitle":"IEEE International Symposium on Information Theory", "organization":"IEEE", "address":"Yokohama, Japan", "origin":"http://service.simile-widgets.org/babel/preview#The%20Benefits%20of%20Network%20Coding%20over%20Routing%20in%20a%20Randomized%20Setting" }, {"id":"Byzantine Modification Detection in Multicast Networks Using Randomized Network Coding", "label":"Byzantine Modification Detection in Multicast Networks Using Randomized Network Coding", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:9cb3652bd75e5dd3fad57db125a4938e", "modified":"no", "abstract":"Distributed randomized network coding, a robust approach to multicasting in distributed network settings, can be extended to provide Byzantine modification detection without the use of 
cryptographic functions.", "date":"2004", "author":["Ho, Tracey","Leong, B.","Koetter, Ralf","M\\'{e}dard, Muriel","Effros, Michelle","Karger, David R."], "venue":"ISIT", "cat":["Theory","Coding"], "key":"Ho:Byzantine", "year":"2004", "pdf":"http://www.its.caltech.edu/~tho/isit04.ps", "pub-type":"inproceedings", "booktitle":"International Symposium on Information Theory (ISIT)", "origin":"http://service.simile-widgets.org/babel/preview#Byzantine%20Modification%20Detection%20in%20Multicast%20Networks%20Using%20Randomized%20Network%20Coding" }, {"id":"Network Coding from a Network Flow Perspective", "label":"Network Coding from a Network Flow Perspective", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:300546a82b635156f370ddf9747b1cfb", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"IEEE International Symposium on Information Theory", "abstract":"We make precise connections between algebraic network coding and network flows. 
Our combinatorial formulations offer new insights, mathematical simplicity, and lead to a substantially tighter upper bound on the coding field size required for a given connection problem than that in [5].", "date":"2003-06", "author":["Ho, Tracey","Karger, David R.","M\\'{e}dard, Muriel","Koetter, Ralf"], "venue":"ISIT", "month":"June", "cat":["Theory","Applications of Theory","Coding","Cuts and Flows"], "key":"Karger:NetworkCoding", "year":"2003", "pdf":"http://www.its.caltech.edu/~tho/isit03-2.pdf", "pub-type":"inproceedings", "confurl":"http://www.isit2003.org/", "booktitle":"IEEE International Symposium on Information Theory", "organization":"IEEE", "address":"Yokohama, Japan", "origin":"http://service.simile-widgets.org/babel/preview#Network%20Coding%20from%20a%20Network%20Flow%20Perspective" }, {"id":"An Experimental Study of Polylogarithmic Fully-Dynamic Connectivity Algorithms", "label":"An Experimental Study of Polylogarithmic Fully-Dynamic Connectivity Algorithms", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:79ab249011300fd14715daf3ae9f6ffd", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "abstract":"We present an experimental study of different variants of the amortized $O(\\log^2 n)$-time fully-dynamic connectivity algorithm of Holm, de Lichtenberg, and Thorup (STOC'98). The experiments build upon experiments provided by Alberts, Cattaneo, and Italiano (SODA'96) on the randomized amortized $O(\\log^3 n)$ fully-dynamic connectivity algorithm of Henzinger and King (STOC'95). Our experiments shed light upon similarities and differences between the two algorithms. We also present a slightly modified version of the Henzinger-King algorithm that runs in $O(\\log^2 n)$ time, which resulted from our experiments. 
", "date":"2001", "author":["Iyer, Raj D.","Karger, David R.","Rahul, Hariharan","Thorup, Mikkel"], "cat":"Theory", "journal":"ACM Journal of Experimental Algorithmics", "key":"Karger:ImpConn", "year":"2001", "brag":"Special issue of papers selected from ALENEX 2000", "vol":"6", "pub-type":"article", "origin":"http://service.simile-widgets.org/babel/preview#An%20Experimental%20Study%20of%20Polylogarithmic%20Fully-Dynamic%20Connectivity%20Algorithms" }, {"id":"Randomized approximation schemes for cuts and flows in capacitated graphs", "label":"Randomized approximation schemes for cuts and flows in capacitated graphs", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:29934200be2862bc5c6750eb1103b39a", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "pages":"290--319", "date":"2015", "author":["Bencz{\\'u}r, Andr{\\'a}s A.","Karger, David R."], "volume":"44", "publisher":"Society for Industrial and Applied Mathematics", "cat":["Theory","Cuts and Flows"], "journal":"SIAM Journal on Computing", "key":"Karger:Stcut", "year":"2015", "pub-type":"article", "number":"2", "origin":"http://service.simile-widgets.org/babel/preview#Randomized%20approximation%20schemes%20for%20cuts%20and%20flows%20in%20capacitated%20graphs" }, {"id":"Iterative Learning for Reliable Crowd-sourcing Systems", "label":"Iterative Learning for Reliable Crowd-sourcing Systems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:1183a01d966a2e54b6cb7b528a559132", "modified":"no", "date":"2011-12", "author":["Karger, David R.","Oh, Sewoong","Shah, Devavrat"], "venue":"NIPS", "month":"December", "cat":["Applications of Theory","Coding","Machine Learning","Theory","Crowdsourcing"], "key":"Karger:TurkBudget", "year":"2011", "pub-type":"inproceedings", "booktitle":"$25^{th}$ Annual Conference on Neural Information Processing Systems (NIPS)", 
"origin":"http://service.simile-widgets.org/babel/preview#Iterative%20Learning%20for%20Reliable%20Crowd-sourcing%20Systems" }, {"id":"Implementing a Fully Polynomial Time Approximation Scheme for All Terminal Network Reliability", "label":"Implementing a Fully Polynomial Time Approximation Scheme for All Terminal Network Reliability", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:36d0359946c0505174c622d5f9f819ed", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"Proceedings of the {$8^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "place":"New Orleans, LA", "abstract":"The classic all-terminal network reliability problem posits a graph, each of whose edges fails (disappears) independently with some given probability. The goal is to determine the probability that the network becomes disconnected due to edge failures. The practical applications of this question to communication networks are obvious, and the problem has therefore been the subject of a great deal of study. Since it is #P-complete, and thus believed hard to solve exactly, a great deal of research has been devoted to estimating the failure probability. A comprehensive survey can be found in [Col87]. The first author recently presented an algorithm for approximating the probability of network disconnection under random edge failures. In this paper, we report on our experience implementing this algorithm. Our implementation shows that the algorithm is practical on networks of moderate size, and indeed works better than the theoretical bounds predict. Part of this improvement arises from heuristic modifications to the theoretical algorithm, while another part suggests that the theoretical running time analysis of the algorithm might not be tight. Based on our observation of the implementation, we were able to devise analytic explanations of at least some of the improved performance. 
As one example, we formally prove the accuracy of a simple heuristic approximation for the reliability. We also discuss other questions raised by the implementation which might be susceptible to analysis. ", "pages":"334--343", "date":"1997-01", "ps":"Papers/imprel.ps", "author":["Karger, David R.","Tai, Ray P."], "editor":"Michael Saks", "venue":"SODA", "month":"January", "cat":["Theory","Cuts and Flows"], "key":"Karger:ImpRel", "year":"1997", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$8^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "psgz":"Papers/imprel.ps.gz", "organization":"ACM-SIAM", "origin":"http://service.simile-widgets.org/babel/preview#Implementing%20a%20Fully%20Polynomial%20Time%20Approximation%20Scheme%20for%20All%20Terminal%20Network%20Reliability" }, {"id":"Polynomial Time Approximation Schemes for Dense Instances of {$\\NP$}-Hard Problems", "label":"Polynomial Time Approximation Schemes for Dense Instances of {$\\NP$}-Hard Problems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:9376c4ae6954ee4c58ef4e52315b8cfb", "modified":"no", "note":"Journal version appears in Journal of Computer and System Sciences", "crossref":"Proceedings of the {$27^{th}$} {ACM} Symposium on Theory of Computing", "place":"Las Vegas, NV", "pages":"284--293", "date":"1995-05", "ps":"http://people.csail.mit.edu/karger/Papers/dense.ps", "author":["Arora, Sanjeev","Karger, David R.","Karpinski, Marek"], "venue":"STOC", "publisher":"ACM Press", "month":"May", "cat":"Theory", "key":"Karger:Dense-Conf", "year":"1995", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$27^{th}$} {ACM} Symposium on Theory of Computing", "organization":"ACM", "origin":"http://service.simile-widgets.org/babel/preview#Polynomial%20Time%20Approximation%20Schemes%20for%20Dense%20Instances%20of%20%7B%24%5CNP%24%7D-Hard%20Problems" }, {"id":"Exhibit: Lightweight Structured Data Publishing", "label":"Exhibit: Lightweight Structured Data 
Publishing", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:4dede623420e39886a36dda0033c8e6f", "modified":"no", "abstract":"It is no surprise that Semantic Web researchers and enthusiasts are excited to publish and accumulate semi-structured data on the Web. Beyond our community, however, we see many authors with structured data who want to publish it in rich browsing interfaces. These small-time authors are similar to early enthusiasts of the Web, simply excited by the opportunity to use a new medium to share information that they care about. For these users, we propose Exhibit, a lightweight structured data publishing framework that duplicates many of the desirable properties that contributed to the original growth of the Web. We argue that appealing to this segment of the Web population, addressing their publishing needs at very low cost, lets us leverage their labor to put structure on content that otherwise would be published in hand-authored HTML, and thus very hard to harvest automatically.", "pages":"737--746", "date":"2007-05", "author":["Huynh, David","Miller, Robert","Karger, David R."], "doi":"http://dx.doi.org/10.1145/1242572.1242672", "pdfkb":"2460", "venue":"WWW", "month":"May", "cat":["Information Retrieval","Semantic Web","CHI","Haystack"], "key":"Karger:Exhibit", "year":"2007", "pdf":"http://people.csail.mit.edu/dfhuynh/research/papers/www2007-exhibit.pdf", "origin":"http://people.csail.mit.edu/dfhuynh/publications.html#Exhibit%3A%20Lightweight%20Structured%20Data%20Publishing", "pub-type":"inproceedings", "project":"Exhibit", "booktitle":"WWW 2007", "projectsite":"http://simile.mit.edu/exhibit/", "location":"Banff, Alberta, Canada" }, {"id":"Internet Surveillance of Pro-drug Websites. II. Identification of Emerging Drug Use Trends.", "label":"Internet Surveillance of Pro-drug Websites. II. 
Identification of Emerging Drug Use Trends.", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:56e94d67cd8a75d31c432617b9dec774", "modified":"no", "note":"(abstract)", "pages":"537", "date":"2001", "author":["Boyer, Edward W.","Shih, Kai","Karger, David R.","Quang, L.","Case, P."], "volume":"39", "cat":"Information Retrieval", "journal":"Journal of Toxicology: Clinical Toxicology", "key":"Karger:Tox2", "year":"2001", "pub-type":"article", "origin":"http://service.simile-widgets.org/babel/preview#Internet%20Surveillance%20of%20Pro-drug%20Websites.%20II.%20Identification%20of%20Emerging%20Drug%20Use%20Trends." }, {"id":"Less is more: probabilistic models for retrieving fewer relevant documents", "label":"Less is more: probabilistic models for retrieving fewer relevant documents", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:277aa474deabacdd44e806eef5f15577", "modified":"no", "crossref":"$29^{th}$ International ACM SIGIR Conference", "abstract":"Traditionally, information retrieval systems aim to maximize the number of relevant documents returned to a user within some window of the top. For that goal, the probability ranking principle, which ranks documents in decreasing order of probability of relevance, is provably optimal. However, there are many scenarios in which that ranking does not optimize for the user's information need. One example is when the user would be satisfied with some limited number of relevant documents, rather than needing all relevant documents. We show that in such a scenario, an attempt to return many relevant documents can actually reduce the chances of finding any relevant documents. We consider a number of information retrieval metrics from the literature, including the rank of the first relevant result, the %no metric that penalizes a system only for retrieving no relevant results near the top, and the diversity of retrieved results when queries have multiple interpretations. 
We observe that given a probabilistic model of relevance, it is appropriate to rank so as to directly optimize these metrics in expectation. While doing so may be computationally intractable, we show that a simple greedy optimization algorithm that approximately optimizes the given objectives produces rankings for TREC queries that outperform the standard approach based on the probability ranking principle.", "pages":"429-436", "date":"2006-07", "author":["Chen, Harr","Karger, David R."], "venue":"SIGIR", "publisher":"ACM", "month":"July", "cat":"Information Retrieval", "key":"Karger:LessIsMore", "year":"2006", "pdf":"http://www.csail.mit.edu/~harr/papers/sigir2006.pdf", "pub-type":"inproceedings", "confurl":"http://www.sigir.org/sigir2006/", "booktitle":"$29^{th}$ International ACM SIGIR Conference", "organization":"ACM SIGIR", "origin":"http://service.simile-widgets.org/babel/preview#Less%20is%20more%3A%20probabilistic%20models%20for%20retrieving%20fewer%20relevant%20documents" }, {"id":"A Random Linear Network Coding Approach to Multicast", "label":"A Random Linear Network Coding Approach to Multicast", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:68a7350e5bd5d751313918f2718f00d2", "modified":"no", "abstract":"We present a distributed random linear network coding approach for transmission and compression of information in general multisource multicast networks. Network nodes independently and randomly select linear mappings from inputs onto output links over some field. We show that this achieves capacity with probability exponentially approaching 1 with the code length. We also demonstrate that random linear coding performs compression when necessary in a network, generalizing error exponents for linear Slepian-Wolf coding in a natural way. Benefits of this approach are decentralized operation and robustness to network changes or link failures. 
We show that this approach can take advantage of redundant network capacity for improved success probability and robustness. We illustrate some potential advantages of random linear network coding over routing in two examples of practical scenarios: distributed network operation and networks with dynamically varying connections. Our derivation of these results also yields a new bound on required field size for centralized network coding on general multicast networks.", "pages":"4413-4430", "date":"2006", "author":["Ho, Tracey","M\\'{e}dard, Muriel","Koetter, Ralf","Karger, David R.","Effros, Michelle","Shi, J.","Leong, B."], "volume":"52", "cat":["Theory","Coding","Cuts and Flows"], "journal":"IEEE Transactions on Information Theory", "key":"Ho:RandomLinearCoding", "year":"2006", "pdf":"http://www.its.caltech.edu/~tho/itrandom-final.pdf", "pub-type":"article", "number":"10", "origin":"http://service.simile-widgets.org/babel/preview#A%20Random%20Linear%20Network%20Coding%20Approach%20to%20Multicast" }, {"id":"Fast Connected Components Algorithms for the {EREW} {PRAM}", "label":"Fast Connected Components Algorithms for the {EREW} {PRAM}", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:d8c5f95d8c06da2b465368bd672ea20d", "modified":"no", "crossref":"Proceedings of the {$4^{th}$} Annual {ACM}-{SIAM} Symposium on Parallel Algorithms and Architectures", "place":"San Diego, CA", "abstract":" We present fast and efficient parallel algorithms for finding the connected components of an undirected graph. These algorithms run on the exclusive-read, exclusive-write (EREW) PRAM. On a graph with n vertices and m edges, our randomized algorithm runs in O(log n) time using $(m+n^{1+\\epsilon})/\\log n$ EREW processors (for any fixed $\\epsilon>0$). A variant uses (m+n)/log n processors and runs in O(log n log log n) time. A deterministic version of the algorithm runs in $O(\\log^{1.5}n)$ time using m+n EREW processors. 
", "pages":"562-572", "date":"1992-06", "ps":"http://people.csail.mit.edu/karger/Papers/conn-components.ps", "author":["Karger, David R.","Nisan, Noam","Parnas, Michal"], "venue":"SPAA", "month":"June", "cat":"Theory", "key":"Karger:Connectivity-Conf", "year":"1992", "brag":"Journal version appears in SIAM Journal on Computing 28(3)", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$4^{th}$} Annual {ACM}-{SIAM} Symposium on Parallel Algorithms and Architectures", "organization":"ACM", "origin":"http://service.simile-widgets.org/babel/preview#Fast%20Connected%20Components%20Algorithms%20for%20the%20%7BEREW%7D%20%7BPRAM%7D" }, {"id":"Using URLs and Table Layout for Web Classification Tasks", "label":"Using URLs and Table Layout for Web Classification Tasks", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:e2a7bddd445ab79b9584f4f35d99f8a6", "modified":"no", "crossref":"Proceedings of the $13^{th}$ International World Wide Web Conference", "abstract":" We propose new features and algorithms for automating Web-page classification tasks such as content recommendation and ad blocking. We show that the automated classification of Web pages can be much improved if, instead of looking at their textual content, we consider each link's URL and the visual placement of those links on a referring page. These features are unusual: rather than being scalar measurements like word counts they are \\emph{tree structured}---describing the position of the item in a tree. We develop a model and algorithm for machine learning using such tree-structured features. We apply our methods in automated tools for recognizing and blocking Web advertisements and for recommending ``interesting'' news stories to a reader. 
Experiments show that our algorithms are both faster and more accurate than those based on the text content of Web documents.", "date":"2004-05", "author":["Shih, Kai","Karger, David R."], "venue":"WWW", "month":"May", "cat":["Information Retrieval","CHI","Machine Learning"], "key":"Karger:TreeLearning", "year":"2004", "pdf":"http://www2004.org/proceedings/docs/1p193.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the $13^{th}$ International World Wide Web Conference", "origin":"http://service.simile-widgets.org/babel/preview#Using%20URLs%20and%20Table%20Layout%20for%20Web%20Classification%20Tasks" }, {"id":"Dynamic Graph Algorithms with Applications", "label":"Dynamic Graph Algorithms with Applications", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:13ceba15df7a246c3d3c37c512ebddf9", "modified":"no", "abstract":" First we review amortized fully-dynamic polylogarithmic algorithms for connectivity, minimum spanning trees (MST), 2-edge- and biconnectivity. Second we discuss how they yield improved static algorithms: connectivity for constructing a tree from homeomorphic subtrees, 2-edge connectivity for finding unique matchings in graphs, and MST for packing spanning trees in graphs. The application of MST for spanning tree packing is new and when boot-strapped, it yields a fully-dynamic polylogarithmic algorithm for approximating general edge connectivity within a factor $\\sqrt{2+o(1)}$. 
Finally, on the more practical side, we will discuss how output sensitive algorithms for dynamic shortest paths have been applied successfully to speed up local search algorithms for improving routing on the internet, roughly doubling the capacity.", "date":"2000-07", "author":["Thorup, Mikkel","Karger, David R."], "venue":"SWAT", "month":"July", "cat":["Theory","Cuts and Flows"], "key":"Karger:DynamicCut", "year":"2000", "pub-type":"inproceedings", "booktitle":"Seventh Scandinavian Workshop on Algorithm Theory", "origin":"http://service.simile-widgets.org/babel/preview#Dynamic%20Graph%20Algorithms%20with%20Applications" }, {"id":"Augmenting Undirected Edge Connectivity in {$\\Olog(n^2)$} Time", "label":"Augmenting Undirected Edge Connectivity in {$\\Olog(n^2)$} Time", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:cd31f031e2c2295d3768d7fd679269de", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "abstract":"We give improved randomized (Monte Carlo) algorithms for undirected edge splitting and edge connectivity augmentation problems. 
Our algorithms run in time $\\tilde{O}(n^2)$ on n-vertex graphs, making them an $\\tilde{\\Omega}(m/n)$ factor faster than the best known deterministic ones on m-edge graphs.", "pages":"2--36", "date":"2000", "ps":"http://people.csail.mit.edu/karger/Papers/augment-journal.ps", "author":["Bencz{\\'u}r, Andr{\\'a}s A.","Karger, David R."], "volume":"37", "cat":["Theory","Cuts and Flows"], "journal":"Journal of Algorithms", "key":"Karger:Augmentation", "year":"2000", "brag":"Special issue of selected papers from Proceedings of the {$9^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "pub-type":"article", "origin":"http://service.simile-widgets.org/babel/preview#Augmenting%20Undirected%20Edge%20Connectivity%20in%20%7B%24%5COlog(n%5E2)%24%7D%20Time" }, {"id":"Approximate $s$--$t$ Min-Cuts in {$\\Olog(n^2)$} Time", "label":"Approximate $s$--$t$ Min-Cuts in {$\\Olog(n^2)$} Time", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:3e33b954a6c491ad7f908cdd32fd0eed", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"Proceedings of the {$28^{th}$} {ACM} Symposium on Theory of Computing", "place":"Philadelphia, PA", "pages":"47--55", "date":"1996-05", "ps":"http://people.csail.mit.edu/karger/Papers/stcut.ps", "author":["Bencz{\\'u}r, Andr{\\'a}s A.","Karger, David R."], "editor":"Gary Miller", "venue":"STOC", "publisher":"ACM Press", "month":"May", "cat":["Theory","Cuts and Flows"], "key":"Karger:Stcut-conf", "year":"1996", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$28^{th}$} {ACM} Symposium on Theory of Computing", "organization":"ACM", "origin":"http://service.simile-widgets.org/babel/preview#Approximate%20%24s%24--%24t%24%20Min-Cuts%20in%20%7B%24%5COlog(n%5E2)%24%7D%20Time" }, {"id":"Content Modelling Using Latent Permutations", "label":"Content Modelling Using Latent Permutations", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:45f1bdd36588e4948f8b7ec5d0d25423", "modified":"no", 
"abstract":"We present a novel Bayesian topic model for learning discourse-level document structure. Our model leverages insights from discourse theory to constrain latent topic assignments in a way that reflects the underlying organization of document topics. We propose a global model in which both topic selection and ordering are biased to be similar across a collection of related documents. We show that this space of orderings can be effectively represented using a distribution over permutations called the Generalized Mallows Model. We apply our method to three complementary discourse-level tasks: cross-document alignment, document segmentation, and information ordering. Our experiments show that incorporating our permutation-based model in these applications yields substantial improvements in performance over previously proposed methods.", "pages":"129--163", "date":"2009-09", "author":["Chen, Harr","Branavan, S.R.K.","Barzilay, Regina","Karger, David R."], "doi":"http://dx.doi.org/10.1613/jair.2830", "volume":"36", "month":"September", "cat":["Machine Learning","Information Retrieval"], "journal":"Journal of Artificial Intelligence Research", "key":"Karger:Mallows-JAIR", "year":"2009", "pdf":"http://www.jair.org/media/2830/live-2830-4684-jair.pdf", "pub-type":"article", "origin":"http://service.simile-widgets.org/babel/preview#Content%20Modelling%20Using%20Latent%20Permutations" }, {"id":"Fast augmenting paths by random sampling from residual graphs", "label":"Fast augmenting paths by random sampling from residual graphs", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:6d5d20057cb45e78416c678408c3cf0b", "modified":"no", "pages":"320--339", "date":"2015", "author":["Karger, David R.","Levine, Matthew S."], "volume":"44", "publisher":"Society for Industrial and Applied Mathematics", "cat":["Theory","Cuts and Flows"], "journal":"SIAM Journal on Computing", "key":"Karger:AugmentingPath-Journal", "year":"2015", "pub-type":"article", 
"number":"2", "origin":"http://service.simile-widgets.org/babel/preview#Fast%20augmenting%20paths%20by%20random%20sampling%20from%20residual%20graphs" }, {"id":"(De)Randomized Construction of Small Sample Spaces in {$\\NC$}", "label":"(De)Randomized Construction of Small Sample Spaces in {$\\NC$}", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:7f6000576c457674d0369c13bcbe016b", "modified":"no", "note":"Journal version appears in Journal of Computer and System Sciences 55", "crossref":"Proceedings of the {$35^{th}$} Annual Symposium on the Foundations of Computer Science", "place":"Santa Fe, NM", "abstract":"Koller and Megiddo introduced the paradigm of constructing compact distributions that satisfy a given set of constraints and showed how it can be used to efficiently derandomize certain types of algorithms. In this paper, we significantly extend their results in two ways. First, we show how their approach can be applied to deal with more general expectation constraints. More importantly, we provide the first parallel (NC) algorithm for constructing a compact distribution that satisfies the constraints up to a small relative error. This algorithm deals with constraints over any event that can be verified by finite automata, including all independence constraints as well as constraints over events relating to the parity or sum of a certain set of variables. Our construction relies on a new and independently interesting parallel algorithm for converting a solution to a linear system into an almost basic approximate solution to the same system. We use these techniques in the first NC derandomization of an algorithm for constructing large independent sets in d-uniform hypergraphs for arbitrary d. 
We also show how the linear programming perspective suggests new proof techniques which might be useful in general probabilistic analysis.", "pages":"252--263", "date":"1994-11", "ps":"http://people.csail.mit.edu/karger/Papers/proc-random.ps", "author":["Karger, David R.","Koller, Daphne"], "editor":"Shafi Goldwasser", "venue":"FOCS", "publisher":"IEEE Computer Society Press", "month":"November", "cat":"Theory", "key":"Karger:Random-Conf", "year":"1994", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$35^{th}$} Annual Symposium on the Foundations of Computer Science", "organization":"IEEE", "origin":"http://service.simile-widgets.org/babel/preview#(De)Randomized%20Construction%20of%20Small%20Sample%20Spaces%20in%20%7B%24%5CNC%24%7D" }, {"id":"LP Decoding", "label":"LP Decoding", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:a157484df3660642ebd2fe40afd3a991", "modified":"no", "abstract":"Linear programming (LP) relaxation is a common technique used to find good solutions to complex optimization problems. We present the method of ``LP decoding'': applying LP relaxation to the problem of maximum-likelihood (ML) decoding. An arbitrary binary-input memoryless channel is considered. This treatment of the LP decoding method places our previous work on turbo codes [6] and low-density parity-check (LDPC) codes [8] into a generic framework. We define the notion of a proper relaxation, and show that any LP decoder that uses a proper relaxation exhibits many useful properties. We describe the notion of pseudocodewords under LP decoding, unifying many known characterizations for specific codes and channels. The fractional distance of an LP decoder is defined, and it is shown that LP decoders correct a number of errors equal to half the fractional distance. We also discuss the application of LP decoding to binary linear codes. We define the notion of a relaxation being symmetric for a binary linear code. 
We show that if a relaxation is symmetric, one may assume that the all-zeros codeword is transmitted.", "date":"2003-10", "author":["Feldman, Jon","Karger, David R.","Wainwright, Martin J."], "venue":"Allerton", "month":"October", "cat":["Theory","Coding"], "key":"Karger:LPDecoding", "year":"2003", "pdf":"Papers/allerton03.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the 41st Annual Allerton Conference on Communication, Control, and Computing", "psgz":"Papers/allerton03.ps.gz", "origin":"http://service.simile-widgets.org/babel/preview#LP%20Decoding" }, {"id":"Management of personal information scraps", "label":"Management of personal information scraps", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:87d718440920155cf9c80bc9ead9fcef", "modified":"no", "note":"Poster.", "abstract":"We introduce research on information scraps: short, self-contained personal notes that fall outside of traditional filing schemes. We report on a preliminary study of information scraps' nature and outline plans for the next phase of our user study. Based on ongoing study results, we describe our designs and prototypes for information scrap capture and access tools.", "pages":"2285-2290", "date":"2007-05", "author":["Bernstein, Michael S.","Van Kleek, Max","schraefel, m. 
c.","Karger, David R."], "venue":"CHI", "month":"May", "cat":"CHI", "key":"Karger:Doing", "year":"2007", "pdf":"http://people.csail.mit.edu/emax/papers/infoscraps_WIP_chi2007.pdf", "pub-type":["misc","poster"], "booktitle":"CHI Extended Abstracts", "origin":"http://service.simile-widgets.org/babel/preview#Management%20of%20personal%20information%20scraps" }, {"id":"Random Sampling in Matroids, with Applications to Graph Connectivity and Minimum Spanning Trees", "label":"Random Sampling in Matroids, with Applications to Graph Connectivity and Minimum Spanning Trees", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:52c2fbea52736e6cd92773af5b27a695", "modified":"no", "note":"Journal version appears in Mathematical Programming B 82(1--2)", "crossref":"Proceedings of the {$34^{th}$} Annual Symposium on the Foundations of Computer Science", "place":"Palo Alto, CA", "abstract":"Random sampling is a powerful tool for gathering information about a group by considering only a small part of it. We discuss some broadly applicable paradigms for using random sampling in combinatorial optimization and demonstrate the effectiveness of these paradigms for two optimization problems on matroids: Finding an optimum matroid basis and packing disjoint matroid bases. Applications of these ideas to the graphic matroid led to fast algorithms for minimum spanning trees and minimum cuts. An optimum matroid basis is typically found by a greedy algorithm that grows an independent set into the optimum basis one element at a time. This continuous change in the independent set can make it hard to perform the independence tests needed by the greedy algorithm. We simplify matters by using sampling to reduce the problem of finding an optimum matroid basis to the problem of verifying that a given fixed basis is optimum, showing that the two problems can be solved in roughly the same time. 
Another application of sampling is to packing matroid bases, also known as matroid partitioning. Sampling reduces the number of bases that must be packed. We combine sampling with a greedy packing strategy that reduces the size of the matroid. Together, these techniques give accelerated packing algorithms. We give particular attention to the problem of packing spanning trees in graphs, which has applications in network reliability analysis. Our results can be seen as generalizing certain results from random graph theory. The techniques have also been effective for other packing problems.", "pages":"84--93", "date":"1993-11", "ps":"http://people.csail.mit.edu/karger/Papers/matroid.ps", "author":"Karger, David R.", "editor":"Leonidas Guibas", "venue":"FOCS", "publisher":"IEEE Computer Society Press", "month":"November", "cat":["Theory","Cuts and Flows"], "key":"Karger:Matroid-Conf", "year":"1993", "pdf":"http://people.csail.mit.edu/karger/Papers/matroid.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$34^{th}$} Annual Symposium on the Foundations of Computer Science", "organization":"IEEE", "origin":"http://service.simile-widgets.org/babel/preview#Random%20Sampling%20in%20Matroids%2C%20with%20Applications%20to%20Graph%20Connectivity%20and%20Minimum%20Spanning%20Trees" }, {"id":"On Approximating the Longest Path in a Graph", "label":"On Approximating the Longest Path in a Graph", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:e97bec05f06675ea8aa03bce036c7ee9", "modified":"no", "note":"A preliminary version appeared in the 1993 Workshop on Algorithms and Data Structures", "abstract":"We consider the problem of approximating the longest path in undirected graphs. In an attempt to pin down the best achievable performance ratio of an approximation algorithm for this problem, we present both positive and negative results. First, a simple greedy algorithm is shown to find long paths in dense graphs. 
We then consider the problem of finding paths in graphs that are guaranteed to have extremely long paths. We devise an algorithm that finds paths of a logarithmic length in Hamiltonian graphs. This algorithm works for a much larger class of graphs (weakly Hamiltonian), where the result is the best possible. Since the hard case appears to be that of sparse graphs, we also consider sparse random graphs. Here we show that a relatively long path can be obtained, thereby partially answering an open problem of Broder et al. To explain the difficulty of obtaining better approximations, we also prove hardness results. We show that, for any $\\epsilon<1$, the problem of finding a path of length $n-n^{\\epsilon}$ in an $n$-vertex Hamiltonian graph is NP-hard. We then show that no polynomial-time algorithm can find a constant factor approximation to the longest-path problem unless P=NP. We conjecture that the result can be strengthened to say that, for some constant $\\delta>0$, finding an approximation of ratio $n^{\\delta}$ is also NP-hard. As evidence toward this conjecture, we show that if any polynomial-time algorithm can approximate the longest path to a ratio of $2^{O(\\log^{1-\\epsilon} n)}$, for any $\\epsilon>0$, then NP has a quasi-polynomial deterministic time simulation. The hardness results apply even to the special case where the input consists of bounded degree graphs. ", "pages":"82--98", "date":"1997-05", "ps":"http://people.csail.mit.edu/karger/Papers/hamilton.ps", "author":["Karger, David R.","Ramkumar, G. D. 
S.","Motwani, Rajeev"], "volume":"18", "month":"May", "cat":"Theory", "journal":"Algorithmica", "key":"Karger:Hamilton", "year":"1997", "pub-type":"article", "number":"1", "origin":"http://service.simile-widgets.org/babel/preview#On%20Approximating%20the%20Longest%20Path%20in%20a%20Graph" }, {"id":" {U-REST}: an unsupervised record extraction system", "label":" {U-REST}: an unsupervised record extraction system", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:428d90f93f50bd3cec5734420e741e64", "modified":"no", "note":"Poster.", "crossref":"16th International World Wide Web Conference (WWW2007)", "abstract":"We demonstrate a system that extracts record sets from record-list web pages with no direct human supervision. Our system, U-REST, reframes the problem of unsupervised record extraction as a two-phase machine learning problem with a clustering phase, where structurally similar regions are discovered, and a record cluster detection phase, where discovered grouping of regions are ranked by their likelihood of being records. This framework simplifies the record extraction task, and allows for independent analysis of the algorithms and the underlying features. In our work, we survey a large set of features under this simplified framework. 
We conclude with a preliminary comparison of U-REST against similar systems and show improvements in the extraction accuracy.", "pages":"1347-1348", "date":"2007-05", "author":["Shen, Yuan Kui","Karger, David R."], "venue":"WWW", "month":"May", "cat":["Machine Learning","Information Retrieval"], "key":"Karger:UnsupervisedRecords", "year":"2007", "pdf":"https://www2007.cpsc.ucalgary.ca/posters/poster1005.pdf", "pub-type":["misc","poster"], "booktitle":"16th International World Wide Web Conference (WWW2007)", "location":"Banff, Canada", "origin":"http://service.simile-widgets.org/babel/preview#%20%7BU-REST%7D%3A%20an%20unsupervised%20record%20extraction%20system" }, {"id":"The Perfect Search Engine is not Enough: a Study of Orienteering Behavior in Directed Search", "label":"The Perfect Search Engine is not Enough: a Study of Orienteering Behavior in Directed Search", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:8bb5ff650c2a024564cb36ad3fe117ec", "modified":"no", "abstract":" This paper presents a modified diary study that investigated how people performed personally motivated searches in their email, in their files, and on the Web. Although earlier studies of directed search focused on keyword search, most of the search behavior we observed did not involve keyword search. Instead of jumping directly to their information target using keywords, our participants navigated to their target with small, local steps using their contextual knowledge as a guide, even when they knew exactly what they were looking for in advance. This stepping behavior was especially common for participants with unstructured information organization. The observed advantages of searching by taking small steps include that it allowed users to specify less of their information need and provided a context in which to understand their results. We discuss the implications of such advantages for the design of personal information management tools. 
", "pages":"415--422", "date":"2004", "author":["Teevan, Jaime","Alvarado, Christine","Ackerman, Mark S.","Karger, David R."], "doi":"http://doi.acm.org/10.1145/985692.985745", "venue":"CHI", "publisher":"ACM Press", "cat":["Information Retrieval","CHI","Ethnography"], "key":"Karger:Orienteering", "year":"2004", "isbn":"1-58113-702-8", "pdf":"http://people.csail.mit.edu/teevan/work/publications/papers/chi04.pdf", "pub-type":"inproceedings", "booktitle":"CHI '04: Proceedings of the SIGCHI conference on Human factors in computing systems", "location":"Vienna, Austria", "origin":"http://service.simile-widgets.org/babel/preview#The%20Perfect%20Search%20Engine%20is%20not%20Enough%3A%20a%20Study%20of%20Orienteering%20Behavior%20in%20Directed%20Search" }, {"id":"Budget-Optimal Crowdsourcing using Low-rank Matrix Approximations", "label":"Budget-Optimal Crowdsourcing using Low-rank Matrix Approximations", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:b6661aa6dc4a968e1fcc2deda0fe1aa9", "modified":"no", "pages":"284-291", "date":"2011-09", "author":["Karger, David R.","Oh, Sewoong","Shah, Devavrat"], "venue":"Allerton", "month":"September", "cat":["Applications of Theory","Coding","Machine Learning","Theory","Crowdsourcing"], "key":"Karger:TurkBudgetMatrix", "year":"2011", "pdf":"https://netfiles.uiuc.edu/swoh/www/paper_crowdsourcing_allerton.pdf", "pub-type":"inproceedings", "booktitle":"$49^{th}$ Annual Conference on Communication, Control, and Computing (Allerton)", "origin":"http://service.simile-widgets.org/babel/preview#Budget-Optimal%20Crowdsourcing%20using%20Low-rank%20Matrix%20Approximations" }, {"id":"Eyebrowse: real-time web activity sharing and visualization", "label":"Eyebrowse: real-time web activity sharing and visualization", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:db144388300f4884a1c1eb814579dbb3", "modified":"no", "note":"poster", "pages":"3643--3648", "date":"2010-04", "author":["Van Kleek, 
Max","Moore, Brennan","Xu, Christina","Karger, David R."], "doi":"http://doi.acm.org/10.1145/1753846.1754032", "publisher":"ACM", "month":"April", "cat":"CHI", "key":"Karger:Eyebrowse-poster", "year":"2010", "isbn":"978-1-60558-930-5", "pdf":"Papers/eyebrowse_chi-wip.pdf", "pub-type":["misc","poster"], "booktitle":"CHI EA '10: Proceedings of the 28th of the international conference extended abstracts on Human factors in computing systems", "location":"Atlanta, Georgia, USA", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Eyebrowse%3A%20real-time%20web%20activity%20sharing%20and%20visualization" }, {"id":"Near-Optimal Interprocedural Branch Alignment", "label":"Near-Optimal Interprocedural Branch Alignment", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:7ef0322f880f0ac71f31268bac1eb707", "modified":"no", "abstract":"Branch alignment reorders the basic blocks of a program to minimize pipeline penalties due to control transfer instructions. Prior work in branch alignment has produced useful heuristic methods. We compute lower bounds on the runtime costs from pipeline penalties and present an intraprocedural branch alignment algorithm that approaches the bound. We compare the control penalties and running times of our algorithm to an older, greedy approach and observe that both the greedy method and our method are close to the lower bound on control penalties, suggesting that greedy is good enough. Surprisingly, in actual execution our method produces programs that run noticeably faster than the greedy method. We also report results from training and testing on different data sets, validating that our results can be achieved in real-world usage. 
Training and testing on different data sets slightly reduced the benefits from both branch alignment algorithms, but not enough to change the ranking of algorithms, and not enough to completely erase the benefits from branch alignment.", "pages":"183--193", "date":"1997-06", "author":["Young, Cliff","Johnson, David S.","Karger, David R.","Smith, Michael D."], "venue":"PLDI", "month":"June", "cat":["Systems","Applications of Theory"], "key":"Karger:TSPBranchPrediction", "year":"1997", "pub-type":"inproceedings", "booktitle":"ACM SIGPLAN Conference on Programming Language Design and Implementation", "organization":"ACM", "address":"Las Vegas, NV", "origin":"http://service.simile-widgets.org/babel/preview#Near-Optimal%20Interprocedural%20Branch%20Alignment" }, {"id":"Minimum-cost multicast over coded packet networks", "label":"Minimum-cost multicast over coded packet networks", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:ad663fffb4124a116437fb2f467002e9", "modified":"no", "abstract":"We consider the problem of establishing minimum-cost multicast connections over coded packet networks, i.e. packet networks where the contents of outgoing packets are arbitrary, causal functions of the contents of received packets. We consider both wireline and wireless packet networks as well as both static multicast (where membership of the multicast group remains constant for the duration of the connection) and dynamic multicast (where membership of the multicast group changes in time, with nodes joining and leaving the group). For static multicast, we reduce the problem to a polynomial-time solvable optimization problem, and we present decentralized algorithms for solving it. These algorithms, when coupled with existing decentralized schemes for constructing network codes, yield a fully decentralized approach for achieving minimum-cost multicast. 
By contrast, establishing minimum-cost static multicast connections over routed packet networks is a very difficult problem even using centralized computation, except in the special cases of unicast and broadcast connections. For dynamic multicast, we reduce the problem to a dynamic programming problem and apply the theory of dynamic programming to suggest how it may be solved.", "pages":"2608-2623", "date":"2006", "author":["Lun, Desmond S.","Ratnakar, Niranjan","M\\'{e}dard, Muriel","Koetter, Ralf","Karger, David R.","Ho, Tracey","Ahmed, Ebad","Zhao, Fang"], "volume":"52", "cat":["Theory","Coding","Cuts and Flows"], "journal":"IEEE Transactions on Information Theory", "key":"Karger:MinCostMulticast", "year":"2006", "pub-type":"article", "number":"6", "origin":"http://service.simile-widgets.org/babel/preview#Minimum-cost%20multicast%20over%20coded%20packet%20networks" }, {"id":"9942dd64dced0081d31db6406034527c", "label":"Polynomial Time Approximation Schemes for Dense Instances of {$\\NP$}-Hard Problems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:9942dd64dced0081d31db6406034527c", "modified":"no", "pages":"193--210", "date":"1999", "author":["Arora, Sanjeev","Karger, David R.","Karpinski, Marek"], "volume":"58", "cat":"Theory", "journal":"Journal of Computer and System Sciences", "key":"Karger:Dense", "year":"1999", "brag":"Special issue of selected papers from Proceedings of the {$27^{th}$} {ACM} Symposium on Theory of Computing", "pub-type":"article", "origin":"http://service.simile-widgets.org/babel/preview#9942dd64dced0081d31db6406034527c" }, {"id":"Finding Nearest Neighbors in Growth Restricted Metrics", "label":"Finding Nearest Neighbors in Growth Restricted Metrics", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:02e43039b29795b767f1d5abdbf0a23b", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"Proceedings of the {$33^{rd}$} {ACM} Symposium on Theory of Computing", 
"place":"Montreal, Canada", "abstract":" Most research on nearest neighbor algorithms in the literature has been focused on the Euclidean case. In many practical search problems however, the underlying metric cannot be well approximated by a Euclidean space, and applying general purpose algorithms results in poor performance by not making use of the special properties of the subspace that the points lie in. In this paper, we develop an efficient dynamic data structure for nearest neighbor queries in growth-constrained metrics. These metrics satisfy the property that for any point $q$ the number of points within a radius $2r$ around $q$ is at most a constant times the number of points within radius $r$. Spaces of this kind occur in networking applications, such as the Internet or Peer-to-peer networks, and vector quantization applications, where feature vectors fall into low-dimensional manifolds within high-dimensional vector spaces. ", "pages":"741--750", "date":"2002-05", "ps":"Papers/neighbor.ps", "author":["Karger, David R.","Ruhl, Matthias"], "venue":"STOC", "publisher":"ACM Press", "month":"May", "cat":["Theory","P2P"], "key":"Karger:Neighbor", "year":"2002", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$33^{rd}$} {ACM} Symposium on Theory of Computing", "psgz":"Papers/neighbor.ps.gz", "organization":"ACM", "origin":"http://service.simile-widgets.org/babel/preview#Finding%20Nearest%20Neighbors%20in%20Growth%20Restricted%20Metrics" }, {"id":"Polynomial approximation schemes for smoothed and random instances of multidimensional packing problems", "label":"Polynomial approximation schemes for smoothed and random instances of multidimensional packing problems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:7a3ef64585dfdd807c069b48e4a87c18", "modified":"no", "crossref":"Proceedings of the {$18^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "pages":"1207-1216", "date":"2007-01", "author":["Karger, David 
R.","Onak, Krzysztof"], "doi":"http://doi.acm.org/10.1145/1283383.1283513", "editor":"Nikhil Bansal and Kirk Pruhs and Clifford Stein", "venue":"SODA", "publisher":"SIAM", "month":"January", "cat":"Theory", "key":"Karger:SmoothedBinPacking", "year":"2007", "isbn":"978-0-898716-24-5", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$18^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "organization":"ACM-SIAM", "origin":"http://service.simile-widgets.org/babel/preview#Polynomial%20approximation%20schemes%20for%20smoothed%20and%20random%20instances%20of%20multidimensional%20packing%20problems" }, {"id":"Successful classroom deployment of a social document annotation system", "label":"Successful classroom deployment of a social document annotation system", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:213f5ffe88054f79d643dd82b8a436fe", "modified":"no", "pages":"1883-1892", "date":"2012-05", "author":["Zyto, Sacha","Karger, David R.","Ackerman, Mark S.","Mahajan, Sanjoy"], "url":"http://nb.mit.edu/", "doi":"http://dx.doi.org/10.1145/2207676.2208326", "venue":"CHI", "month":"May", "cat":["CHI","Education"], "key":"Karger:Nb", "year":"2012", "pdf":"http://people.csail.mit.edu/karger/Papers/nb.pdf", "pub-type":"inproceedings", "booktitle":"CHI Conference on Human Factors in Computing Systems", "origin":"http://service.simile-widgets.org/babel/preview#Successful%20classroom%20deployment%20of%20a%20social%20document%20annotation%20system" }, {"id":"Linear Programming-Based Decoding of Turbo-Like Codes and its Relation to Iterative Approaches", "label":"Linear Programming-Based Decoding of Turbo-Like Codes and its Relation to Iterative Approaches", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:fb3688da1c69171b84025f8af481db39", "modified":"no", "abstract":"In recent work (Feldman and Karger [8]), we introduced a new approach to decoding turbo-like codes based on linear programming (LP). 
We gave a precise characterization of the noise patterns that cause decoding error under the binary symmetric and additive white Gaussian noise channels. We used this characterization to prove that the word error rate is bounded by an inverse polynomial in the code length. Furthermore, for any turbo-like code, our algorithm has the ML certificate property: whenever it outputs a code word, it is guaranteed to be the maximum-likelihood (ML) code word. In this paper we extend these results and give an iterative decoder whose output is equivalent to that of the LP decoder. We also extend the ML certificate property to the more efficient iterative tree-reweighted max-product message-passing algorithm developed by Wainwright, Jaakkola, and Willsky [13]: we show that whenever this algorithm converges to a code word, it must be the ML code word. Finally, we demonstrate experimentally that the noise patterns that cause decoding error in the LP decoder also cause decoding error in the standard iterative sum-product and max-product (min-sum) message-passing algorithms. 
Consequently, the deterministically constructible interleaver used by the LP decoder to achieve its bounds on error rate is useful in practice not only for the LP decoder, but for these standard iterative decoders as well.", "date":"2002-10", "author":["Feldman, Jon","Karger, David R.","Wainwright, Martin J."], "venue":"Allerton", "month":"October", "cat":["Theory","Coding"], "key":"Karger:IterativeDecoding", "year":"2002", "pdf":"allerton02.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the 40th Annual Allerton Conference on Communication, Control, and Computing", "psgz":"Papers/allerton02.ps.gz", "origin":"http://service.simile-widgets.org/babel/preview#Linear%20Programming-Based%20Decoding%20of%20Turbo-Like%20Codes%20and%20its%20Relation%20to%20Iterative%20Approaches" }, {"id":"Random Sampling in Cut, Flow, and Network Design Problems", "label":"Random Sampling in Cut, Flow, and Network Design Problems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:5f3830e9ebf933dfc8c0955ecd1c70ed", "modified":"no", "note":"Journal version appears in Mathematics of Operations Research 24(2), 1999", "crossref":"Proceedings of the {$26^{th}$} {ACM} Symposium on Theory of Computing", "place":"Montreal, Quebec, Canada", "abstract":"We explore random sampling as a tool for solving undirected graph problems. We show that the sparse graph, or skeleton, that arises when we randomly sample a graph's edges will accurately approximate the value of all cuts in the original graph. This makes sampling effective for problems involving cuts in graphs. We apply these tools in fast randomized (Monte Carlo and Las Vegas) algorithms for approximating and exactly finding cuts and flows in an unweighted, undirected graph. We also give weighted-graph versions of these algorithms which use a form of scaling to achieve a sublinear dependence on the maximum edge weight. Our methods also reduce the work done by some parallel cut algorithms. 
Our sampling theorems also yield faster algorithms for several other cut-based problems, including approximating the best balanced cut of a graph, finding a k-connected orientation of a 2k-connected graph, and finding integral multicommodity flows in graphs with a great deal of excess capacity. Our sampling theorems apply even when the sampling probabilities are different for different edges. Using this fact, we show how to use randomized rounding in a $(1+o(1))$-approximation algorithm for the NP-complete minimum $k$-connected subgraph problem when $k > \\log n$; the best previous approximation factor was $\\log k$. Our techniques generalize to many other survivable network design problems.", "pages":"648--657", "date":"1994-05", "ps":"http://people.csail.mit.edu/karger/Papers/skeleton.ps", "author":"Karger, David R.", "venue":"STOC", "publisher":"ACM Press", "month":"May", "cat":["Theory","Cuts and Flows"], "key":"Karger:Skeleton-Conf", "year":"1994", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$26^{th}$} {ACM} Symposium on Theory of Computing", "organization":"ACM", "origin":"http://service.simile-widgets.org/babel/preview#Random%20Sampling%20in%20Cut%2C%20Flow%2C%20and%20Network%20Design%20Problems" }, {"id":"Internet Surveillance of Pro-drug Websites. III. Identification of Emerging Drugs of Abuse.", "label":"Internet Surveillance of Pro-drug Websites. III. 
Identification of Emerging Drugs of Abuse.", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:bdeea9bbb578947fc07c694c13d5cba8", "modified":"no", "note":"(abstract)", "pages":"537", "date":"2001", "author":["Boyer, Edward W.","Shih, Kai","Karger, David R.","Quang, L.","Case, P."], "volume":"39", "cat":"Information Retrieval", "journal":"Journal of Toxicology: Clinical Toxicology", "key":"Karger:Tox3", "year":"2001", "pub-type":"article", "origin":"http://service.simile-widgets.org/babel/preview#Internet%20Surveillance%20of%20Pro-drug%20Websites.%20III.%20Identification%20of%20Emerging%20Drugs%20of%20Abuse." }, {"id":"Random Sampling and Greedy Sparsification in Matroid Optimization Problems", "label":"Random Sampling and Greedy Sparsification in Matroid Optimization Problems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:1229fc9898c622a4d682e71639efb1de", "modified":"no", "note":"A preliminary version appeared in Proceedings of the {$34^{th}$} Annual Symposium on the Foundations of Computer Science", "abstract":"Random sampling is a powerful tool for gathering information about a group by considering only a small part of it. We discuss some broadly applicable paradigms for using random sampling in combinatorial optimization, and demonstrate the effectiveness of these paradigms for two optimization problems on matroids: finding an optimum matroid basis and packing disjoint matroid bases. Applications of these ideas to the graphic matroid led to fast algorithms for minimum spanning trees and minimum cuts. An optimum matroid basis is typically found by a greedy algorithm that grows an independent set into an optimum basis one element at a time. This continuous change in the independent set can make it hard to perform the independence tests needed by the greedy algorithm. 
We simplify matters by using sampling to reduce the problem of finding an optimum matroid basis to the problem of verifying that a given fixed basis is optimum, showing that the two problems can be solved in roughly the same time. Another application of sampling is to packing matroid bases, also known as matroid partitioning. Sampling reduces the number of bases that must be packed. We combine sampling with a greedy packing strategy that reduces the size of the matroid. Together, these techniques give accelerated packing algorithms. We give particular attention to the problem of packing spanning trees in graphs, which has applications in network reliability analysis. Our results can be seen as generalizing certain results from random graph theory. The techniques have also been effective for other packing problems.", "pages":"41--81", "date":"1998-06", "ps":"http://people.csail.mit.edu/karger/Papers/matroid.ps", "author":"Karger, David R.", "volume":"82", "month":"June", "cat":["Theory","Cuts and Flows"], "journal":"Mathematical Programming {B}", "key":"Karger:Matroid", "year":"1998", "pub-type":"article", "number":"1--2", "origin":"http://service.simile-widgets.org/babel/preview#Random%20Sampling%20and%20Greedy%20Sparsification%20in%20Matroid%20Optimization%20Problems" }, {"id":"A Scalable Location Service for Geographic Ad-Hoc Routing", "label":"A Scalable Location Service for Geographic Ad-Hoc Routing", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:4f72d4cd6223e404c1d683f8e31c9c9e", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "abstract":" GLS is a new distributed location service which tracks mobile node locations. GLS combined with geographic forwarding allows the construction of ad hoc mobile networks that scale to a larger number of nodes than possible with previous work. GLS is decentralized and runs on the mobile nodes themselves, requiring no fixed infrastructure. 
Each mobile node periodically updates a small set of other nodes (its location servers) with its current location. A node sends its position updates to its location servers without knowing their actual identities, assisted by a predefined ordering of node identifiers and a predefined geographic hierarchy. Queries for a mobile node's location also use the predefined identifier ordering and spatial hierarchy to find a location server for that node. Experiments using the ns simulator for up to 600 mobile nodes show that the storage and bandwidth requirements of GLS grow slowly with the size of the network. Furthermore, GLS tolerates node failures well: each failure has only a limited effect and query performance degrades gracefully as nodes fail and restart. The query performance of GLS is also relatively insensitive to node speeds. Simple geographic forwarding combined with GLS compares favorably with Dynamic Source Routing (DSR): in larger networks (over 200 nodes) our approach delivers more packets, but consumes fewer network resources. ", "pages":"120--130", "date":"2000-08", "ps":"http://pdos.lcs.mit.edu/grid/grid:mobicomm00/paper.ps", "author":["Li, Jinyang","Jannotti, John","De Couto, Douglas S. J.",
"Karger, David R.","Morris, Robert"], "venue":"Mobicom", "month":"August", "cat":["Systems","P2P","Applications of Theory"], "key":"Karger:Grid", "year":"2000", "ppt":"http://pdos.lcs.mit.edu/grid/mobicom_presentation.ppt", "pdf":"http://pdos.lcs.mit.edu/grid/grid:mobicomm00/paper.pdf", "pub-type":"inproceedings", "confurl":"http://portal.acm.org/toc.cfm?id=345910&dl=GUIDE&dl=ACM&type=proceeding&idx=SERIES395&part=Proceedings&WantType=Proceedings", "booktitle":"Proceedings of the $6^{th}$ {ACM} International Conference on Mobile Computing and Networking ({MobiCom} 2000)", "address":"Boston, MA", "origin":"http://service.simile-widgets.org/babel/preview#A%20Scalable%20Location%20Service%20for%20Geographic%20Ad-Hoc%20Routing" }, {"id":"Sharing information in time-division duplexing channels: a network coding approach", "label":"Sharing information in time-division duplexing channels: a network coding approach", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:492a84f450f828577673065296a285a9", "modified":"no", "abstract":"We study random linear network coding for time-division duplexing channels for sharing information between nodes. We assume a packet erasure channel with nodes that cannot transmit and receive information simultaneously. Each node will act as both a sender of its own information and a receiver for the information of the other nodes. When a node acts as the sender, it transmits coded data packets back-to-back before stopping to wait for the receivers to acknowledge the number of degrees of freedom, if any, that are required to decode correctly the information. This acknowledgment comes in the header of the coded packets that are sent by the other nodes. We study the mean time to complete the sharing process between the nodes. We provide a simple algorithm to compute the number of coded packets to be sent back-to-back depending on the state of the system. 
We present numerical results for the case of two nodes sharing data and show that the mean completion time of our scheme is close to the performance of a full duplex network coding scheme and can outperform full duplex schemes with no coding.", "pages":"1403--1410", "date":"2009-09", "author":["Lucani, Daniel E.","M\\'{e}dard, Muriel","Stojanovic, Milica","Karger, David R."], "publisher":"IEEE Press", "month":"September", "cat":["Coding","P2P"], "key":"Lucani:Gossip", "year":"2009", "isbn":"978-1-4244-5870-7", "pub-type":"inproceedings", "booktitle":"Allerton'09: Proceedings of the 47th annual Allerton conference on Communication, control, and computing", "location":"Monticello, Illinois, USA", "address":"Piscataway, NJ, USA", "origin":"http://service.simile-widgets.org/babel/preview#Sharing%20information%20in%20time-division%20duplexing%20channels%3A%20a%20network%20coding%20approach" }, {"id":"Crowds in Two Seconds: Enabling Realtime Crowd-Powered Interfaces", "label":"Crowds in Two Seconds: Enabling Realtime Crowd-Powered Interfaces", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:e11d5e6267ce21311f734c9a0edd181e", "modified":"no", "pages":"33--42", "date":"2011-11", "author":["Bernstein, Michael S.","Brandt, Joel","Miller, Robert C.","Karger, David R."], "doi":"http://doi.acm.org/10.1145/2047196.2047201", "acmid":"2047201", "venue":"UIST", "publisher":"ACM", "month":"November", "cat":["CHI","Systems","Crowdsourcing"], "series":"UIST '11", "key":"Karger:Adrenaline", "numpages":"10", "year":"2011", "isbn":"978-1-4503-0716-1", "pub-type":"inproceedings", "keywords":["crowdsourcing","human computation"], "booktitle":"Proceedings of the 24th annual ACM symposium on User interface software and technology", "location":"Santa Barbara, California, USA", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Crowds%20in%20Two%20Seconds%3A%20Enabling%20Realtime%20Crowd-Powered%20Interfaces" }, {"id":"Load 
Balancing in P2P Systems", "label":"Load Balancing in P2P Systems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:cd0e7cea5b90ddfcba0d02c00e67e9b2", "modified":"no", "crossref":"IPTPS: 3rd International Workshop on Peer to Peer Systems", "abstract":" Load balancing is a critical issue for the efficient operation of peer-to-peer networks. We give new protocols for several scenarios, whose provable performance guarantees are within a constant factor of optimal. First, we give an improved version of consistent hashing, a scheme used for item to node assignments in the Chord system. In its original form, it required every network node to operate $O(\\log n)$ virtual nodes to achieve a balanced load, causing a corresponding increase in space and bandwidth usage. Our protocol eliminates the necessity of virtual nodes while maintaining a balanced load. Improving on related protocols, our scheme allows for the deletion of nodes and admits a simpler analysis, since the assignments do not depend on the history of the network. We then analyze several simple protocols for load sharing by movements of data from higher loaded to lower loaded nodes. These protocols can be extended to preserve the ordering of data items. As an application, we use the last protocol to give an efficient implementation of a distributed data structure for range searches on ordered data. 
", "date":"2004-02", "author":["Karger, David R.","Ruhl, Matthias"], "venue":"IPTPS", "publisher":"Springer", "month":"February", "cat":["Theory","P2P"], "series":"LNCS Hot Topics", "key":"Karger:P2PLoadBalance-IPTPS", "year":"2004", "pdf":"http://iptps04.cs.ucsd.edu/papers/karger-load-balance.pdf", "pub-type":"inproceedings", "booktitle":"IPTPS: 3rd International Workshop on Peer to Peer Systems", "origin":"http://service.simile-widgets.org/babel/preview#Load%20Balancing%20in%20P2P%20Systems" }, {"id":"Decoding Turbo-Like Codes via Linear Programming", "label":"Decoding Turbo-Like Codes via Linear Programming", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:8e9aa6e5f107400f54efef7fd256526c", "modified":"no", "note":"Special Issue of Best Papers from FOCS 2002.", "studentwork":"\\asteriskit{\\labelwidth} ", "abstract":"We introduce a novel algorithm for decoding turbo-like codes based on linear programming. We prove that for the case of Repeat-Accumulate (RA) codes, under the binary symmetric channel with a certain constant threshold bound on the noise, the error probability of our algorithm is bounded by an inverse polynomial in the code length. Our linear program (LP) minimizes the distance between the received bits and binary variables representing the code bits. Our LP is based on a representation of the code where code words are paths through a graph. Consequently, the LP bears a strong resemblance to the min-cost flow LP. 
The error bounds are based on an analysis of the probability, over the random noise of the channel, that the optimum solution to the LP is the path corresponding to the original transmitted code word.", "pages":"733--752", "date":"2004", "author":["Feldman, Jon","Karger, David R."], "doi":"http://dx.doi.org/10.1016/j.jcss.2003.11.005", "volume":"68", "cat":["Theory","Coding","Cuts and Flows"], "journal":"Journal of Computer and System Sciences", "key":"Karger:Turbo-Journal", "year":"2004", "pdf":"http://www.columbia.edu/~jf2189/pubs/LpTurboDecoding.pdf", "pub-type":"article", "number":"4", "origin":"http://service.simile-widgets.org/babel/preview#Decoding%20Turbo-Like%20Codes%20via%20Linear%20Programming" }, {"id":"Atomate it! end-user context-sensitive automation using heterogeneous information sources on the web", "label":"Atomate it! end-user context-sensitive automation using heterogeneous information sources on the web", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:d7454a606738409eabdaf2be0e5b807a", "modified":"no", "abstract":"The transition of personal information management (PIM) tools off the desktop to the Web presents an opportunity to augment these tools with capabilities provided by the wealth of real-time information readily available. In this paper, we describe a next-generation personal information assistance engine that lets end-users delegate to it various simple context- and activity-reactive tasks and reminders. Our system, Atomate, treats RSS/ATOM feeds from social networking and life-tracking sites as sensor streams, integrating information from such feeds into a simple unified RDF world model representing people, places and things and their time-varying states and activities. 
Combined with other information sources on the web, including the user's online calendar, web-based e-mail client, news feeds and messaging services, Atomate can be made to automatically carry out a variety of simple tasks for the user, ranging from context-aware filtering and messaging, to sharing and social coordination actions. Atomate's open architecture and world model easily accommodate new information sources and actions via the addition of feeds and web services. To make routine use of the system easy for non-programmers, Atomate provides a constrained-input natural language interface (CNLI) for behavior specification, and a direct-manipulation interface for inspecting and updating its world model.", "pages":"951--960", "date":"2010-04", "author":["Van Kleek, Max","Moore, Brennan","Karger, David R.","Andr\\'{e}, Paul","schraefel, m. c."], "doi":"http://doi.acm.org/10.1145/1772690.1772787", "publisher":"ACM", "month":"April", "cat":["CHI","Information Retrieval","Semantic Web","Haystack"], "key":"Karger:Atomate", "year":"2010", "isbn":"978-1-60558-799-8", "pdf":"http://people.csail.mit.edu/emax/papers/atomate-www2010-camera.pdf", "pub-type":"inproceedings", "booktitle":"WWW '10: Proceedings of the 19th international conference on World wide web", "location":"Raleigh, North Carolina, USA", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Atomate%20it!%20end-user%20context-sensitive%20automation%20using%20heterogeneous%20information%20sources%20on%20the%20web" }, {"id":"bf5736b4c839c4546d03b748a18dc2f7", "label":"Fast Connected Components Algorithms for the {EREW} {PRAM}", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:bf5736b4c839c4546d03b748a18dc2f7", "modified":"no", "note":"A preliminary version appeared in Proceedings of the {$4^{th}$} Annual {ACM}-{SIAM} Symposium on Parallel Algorithms and Architectures", "pages":"1021--1034", "date":"1999", 
"ps":"http://www.mta.ac.il/~michalp/Papers/connectivity.ps", "author":["Karger, David R.","Nisan, Noam","Parnas, Michal"], "volume":"28", "cat":"Theory", "journal":"SIAM Journal on Computing", "key":"Karger:Connectivity", "year":"1999", "pub-type":"article", "number":"3", "origin":"http://service.simile-widgets.org/babel/preview#bf5736b4c839c4546d03b748a18dc2f7" }, {"id":"Constant Interaction-Time Scatter/Gather Browsing of Very Large Document Collections", "label":"Constant Interaction-Time Scatter/Gather Browsing of Very Large Document Collections", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:0a749e9524edcb971175895d9d3239aa", "modified":"no", "note":"Pittsburgh, PA", "abstract":"The Scatter/Gather document browsing method uses fast document clustering to produce table-of-contents-like outlines of large document collections. Previous work [1] developed linear-time document clustering algorithms to establish the feasibility of this method over moderately large collections. However, even linear-time algorithms are too slow to support interactive browsing of very large collections such as Tipster, the DARPA standard text retrieval evaluation collection. We present a scheme that supports constant interaction-time Scatter/Gather of arbitrarily large collections after near-linear time preprocessing. This involves the construction of a cluster hierarchy. 
A modification of Scatter/Gather employing this scheme, and an example of its use over the Tipster collection are presented.", "pages":"126--134", "date":"1993-07", "ps":"http://people.csail.mit.edu/karger/Papers/sigir93.ps", "author":["Cutting, Douglas","Karger, David R.","Pedersen, Jan"], "venue":"SIGIR", "month":"July", "cat":"Information Retrieval", "key":"Karger:Scatter2", "year":"1993", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$16^{th}$} Annual International {ACM} {SIGIR} Conference on Research and Development in Information Retrieval", "origin":"http://service.simile-widgets.org/babel/preview#Constant%20Interaction-Time%20Scatter%2FGather%20Browsing%20of%20Very%20Large%20Document%20Collections" }, {"id":"Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections", "label":"Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:2091252fe73bc3ffbd806cbda1bc82a7", "modified":"no", "abstract":"Document clustering has not been well received as an information retrieval tool. Objections to its use fall into two main categories: first, that clustering is too slow for large corpora (with running time often quadratic in the number of documents); and second, that clustering does not appreciably improve retrieval. We argue that these problems arise only when clustering is used in an attempt to improve conventional search techniques. However, looking at clustering as an information access tool in its own right obviates these objections, and provides a powerful new access paradigm. We present a document browsing technique that employs document clustering as its primary operation. 
We also present fast (linear time) clustering algorithms which support this interactive browsing paradigm.", "pages":"318-329", "date":"1992-06", "ps":"http://people.csail.mit.edu/karger/Papers/sigir92.ps", "author":["Cutting, Douglas","Karger, David R.","Pedersen, Jan","Tukey, John W."], "doi":"http://doi.acm.org/10.1145/133160.133214", "venue":"SIGIR", "publisher":"ACM Press", "month":"June", "cat":"Information Retrieval", "key":"Karger:Scatter", "year":"1992", "isbn":"0-89791-523-2", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$15^{th}$} Annual International {ACM} {SIGIR} Conference on Research and Development in Information Retrieval", "location":"Copenhagen, Denmark", "origin":"http://service.simile-widgets.org/babel/preview#Scatter%2FGather%3A%20A%20Cluster-based%20Approach%20to%20Browsing%20Large%20Document%20Collections" }, {"id":"e36b68943f1f6d1948f13cdcf964d62e", "label":"Minimum Cuts in Near-Linear Time", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:e36b68943f1f6d1948f13cdcf964d62e", "modified":"no", "note":"A preliminary version appeared in Proceedings of the {$28^{th}$} {ACM} Symposium on Theory of Computing", "abstract":" We significantly improve known time bounds for solving the minimum cut problem on undirected graphs. We use a \"semiduality\" between minimum cuts and maximum spanning tree packings combined with our previously developed random sampling techniques. We give a randomized (Monte Carlo) algorithm that finds a minimum cut in an $m$-edge, $n$-vertex graph with high probability in $O(m \\log^3 n)$ time. We also give a simpler randomized algorithm that finds all minimum cuts with high probability in $O(m \\log^3 n)$ time. This variant has an optimal RNC parallelization. Both variants improve on the previous best time bound of $O(n^2 \\log^3 n)$. 
Other applications of the tree-packing approach are new, nearly tight bounds on the number of near-minimum cuts a graph may have and a new data structure for representing them in a space-efficient manner. ", "pages":"46--76", "date":"2000-01", "ps":"http://people.csail.mit.edu/karger/Papers/lincut-journal.ps", "pdf":"http://people.csail.mit.edu/karger/Papers/lincut-journal.pdf", "author":"Karger, David R.", "volume":"47", "month":"January", "cat":["Theory","Cuts and Flows"], "journal":"Journal of the ACM", "key":"Karger:Lincut", "year":"2000", "pub-type":"article", "number":"1", "origin":"http://service.simile-widgets.org/babel/preview#e36b68943f1f6d1948f13cdcf964d62e" }, {"id":"Derandomization Through Approximation: An {${\\cal NC}$} Algorithm for Minimum Cuts", "label":"Derandomization Through Approximation: An {${\\cal NC}$} Algorithm for Minimum Cuts", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:33f3d94f0e7c82bc3981e17227fb3e81", "modified":"no", "note":"A preliminary version appeared in Proceedings of the {$25^{th}$} {ACM} Symposium on Theory of Computing", "abstract":"We show that the minimum cut and multi-cut problems in weighted undirected graphs can be solved in ${\\cal NC}$. We do so by giving three separate and independently interesting results. The first is an $m^2/n$ processor ${\\cal NC}$ algorithm for a $(2+\\epsilon)$-approximation to the minimum cut. The second is a randomized reduction of the minimum cut problem to the problem of obtaining a $(2+\\epsilon)$-approximation to the minimum cut. This reduction involves a natural combinatorial Safe Sets Problem that can be solved easily in ${\\cal RNC}$. Our third result is a derandomization of this ${\\cal RNC}$ solution that requires a novel combination of two widely used tools: pairwise independence and random walks on expanders. 
We believe that the safe sets approach will prove useful in other derandomization problems.", "pages":"255--272", "date":"1997-01", "ps":"http://epubs.siam.org/sam-bin/dbq/article/27308", "author":["Karger, David R.","Motwani, Rajeev"], "volume":"26", "month":"January", "cat":["Theory","Cuts and Flows"], "journal":"SIAM Journal on Computing", "key":"Karger:DetCut", "year":"1997", "pub-type":"article", "number":"1", "origin":"http://service.simile-widgets.org/babel/preview#Derandomization%20Through%20Approximation%3A%20An%20%7B%24%7B%5Ccal%20NC%7D%24%7D%20Algorithm%20for%20Minimum%20Cuts" }, {"id":"Optimal Rounding Algorithms for a Geometric Embedding of the Multiway Cut Problem", "label":"Optimal Rounding Algorithms for a Geometric Embedding of the Multiway Cut Problem", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:f110ff113e6775bb9f8e2b2480244e71", "modified":"no", "abstract":"Given an undirected graph with edge costs and a subset of $k \\ge 3$ nodes called terminals, a multiway, or k-way, cut is a subset of the edges whose removal disconnects each terminal from the others. The multiway cut problem is to find a minimum-cost multiway cut. This problem is Max-SNP hard. Recently, Calinescu et al. (Calinescu, G., H. Karloff, Y. Rabani. 2000. An improved approximation algorithm for Multiway Cut. J. Comput. System Sci. 60(3) 564-574) gave a novel geometric relaxation of the problem and a rounding scheme that produced a (3/2-1/k)-approximation algorithm. In this paper, we study their geometric relaxation. In particular, we study the worst-case ratio between the value of the relaxation and the value of the minimum multicut (the so-called integrality gap of the relaxation). For k = 3, we show the integrality gap is 12/11, giving tight upper and lower bounds. That is, we exhibit a family of graphs with integrality gaps arbitrarily close to 12/11 and give an algorithm that finds a cut of value 12/11 times the relaxation value. 
Our lower bound shows that this is the best possible performance guarantee for any algorithm based purely on the value of the relaxation. Our upper bound meets the lower bound and improves the factor of 7/6 shown by Calinescu et al. For all k, we show that there exists a rounding scheme with performance ratio equal to the integrality gap, and we give explicit constructions of polynomial-time rounding schemes that lead to improved upper bounds. For k = 4 and 5, our best upper bounds are based on computer-constructed rounding schemes (with computer proofs of correctness). For general k we give an algorithm with performance ratio $1.3438 - \\epsilon_k$. Our results were discovered with the help of computational experiments that we also describe here.", "pages":"436--461", "date":"2004", "author":["Karger, David R.","Klein, Philip N.","Stein, Clifford","Thorup, Mikkel","Young, Neal"], "volume":"29", "cat":"Theory", "journal":"Mathematics of Operations Research", "key":"Karger:Multicut-Journal", "year":"2004", "pdf":"Papers/kcut-journal.pdf", "pub-type":"article", "number":"3", "origin":"http://service.simile-widgets.org/babel/preview#Optimal%20Rounding%20Algorithms%20for%20a%20Geometric%20Embedding%20of%20the%20Multiway%20Cut%20Problem" }, {"id":"Approximate Graph Coloring by Semidefinite Programming", "label":"Approximate Graph Coloring by Semidefinite Programming", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:e6f9f07a2ebc5458d8b94a53cb2287ba", "modified":"no", "note":"A preliminary version appeared in Proceedings of the {$35^{th}$} Annual Symposium on the Foundations of Computer Science", "abstract":"We consider the problem of coloring k-colorable graphs with the fewest possible colors. We present a randomized polynomial time algorithm that colors a 3-colorable graph on n vertices with $\\min\\{O(\\Delta^{1/3} \\log^{1/2} \\Delta \\log n), O(n^{1/4} \\log^{1/2} n)\\}$ colors where $\\Delta$ is the maximum degree of any vertex. 
Besides giving the best known approximation ratio in terms of n, this marks the first nontrivial approximation result as a function of the maximum degree $\\Delta$. This result can be generalized to k-colorable graphs to obtain a coloring using $\\min\\{O(\\Delta^{1-2/k} \\log^{1/2} \\Delta \\log n), O(n^{1-3/(k+1)} \\log^{1/2} n)\\}$ colors. Our results are inspired by the recent work of Goemans and Williamson who used an algorithm for semidefinite optimization problems, which generalize linear programs, to obtain improved approximations for the MAX CUT and MAX 2-SAT problems. An intriguing outcome of our work is a duality relationship established between the value of the optimum solution to our semidefinite program and the Lov\\'{a}sz $\\vartheta$-function. We show lower bounds on the gap between the optimum solution of our semidefinite program and the actual chromatic number; by duality this also demonstrates interesting new facts about the $\\vartheta$-function.", "pages":"246--265", "date":"1998-03", "ps":"http://people.csail.mit.edu/karger/Papers/color.ps", "author":["Karger, David R.","Motwani, Rajeev","Sudan, Madhu"], "volume":"45", "month":"March", "cat":"Theory", "journal":"Journal of the ACM", "key":"Karger:Coloring-Journal", "year":"1998", "pub-type":"article", "number":"2", "origin":"http://service.simile-widgets.org/babel/preview#Approximate%20Graph%20Coloring%20by%20Semidefinite%20Programming" }, {"id":"Observations on the Dynamic Evolution of Peer to Peer Systems", "label":"Observations on the Dynamic Evolution of Peer to Peer Systems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:cbf3b498c6dd8cb0771d897ed1f35f56", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"Proceedings of the First International Workshop on Peer-to-Peer Systems", "abstract":"A fundamental theoretical challenge in peer-to-peer systems is proving statements about the evolution of the system while nodes are continuously joining and leaving. 
Because the system will operate for an infinite time, performance measures based on runtime are uninformative; instead, we must study the rate at which nodes consume resources in order to maintain the system state. This ``maintenance bandwidth'' depends on the rate at which nodes tend to enter and leave the system. In this paper, we formalize this dependence. Having done so, we analyze the Chord peer-to-peer protocol. We show that Chord's maintenance bandwidth to handle concurrent node arrivals and departures is near optimal, exceeding the lower bound by only a logarithmic factor. We also outline and analyze an algorithm that converges to a correct routing state from an arbitrary initial condition.", "date":"2002-03", "author":["Liben-Nowell, David","Balakrishnan, Hari","Karger, David R."], "venue":"IPTPS", "month":"March", "cat":["Theory","Applications of Theory","P2P"], "key":"Karger:P2P-Evol-IPTPS", "year":"2002", "pub-type":"inproceedings", "booktitle":"Proceedings of the First International Workshop on Peer-to-Peer Systems", "address":"Cambridge, MA", "origin":"http://service.simile-widgets.org/babel/preview#Observations%20on%20the%20Dynamic%20Evolution%20of%20Peer%20to%20Peer%20Systems" }, {"id":"47156eb66d65af58d6573ffae8ae3504", "label":"Decoding Turbo-Like Codes via Linear Programming", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:47156eb66d65af58d6573ffae8ae3504", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"9d8a5a66eb8bb509e64547a8c035f650", "place":"Vancouver, Canada", "abstract":"We introduce a novel algorithm for decoding turbo-like codes based on linear programming. We prove that for the case of Repeat-Accumulate (RA) codes, under the binary symmetric channel with a certain constant threshold bound on the noise, the error probability of our algorithm is bounded by an inverse polynomial in the code length. 
Our linear program (LP) minimizes the distance between the received bits and binary variables representing the code bits. Our LP is based on a representation of the code where code words are paths through a graph. Consequently, the LP bears a strong resemblance to the min-cost flow LP. The error bounds are based on an analysis of the probability, over the random noise of the channel, that the optimum solution to the LP is the path corresponding to the original transmitted code word.", "date":"2002-11", "author":["Feldman, Jon","Karger, David R."], "doi":"doi.ieeecomputersociety.org/10.1109/SFCS.2002.1181948", "venue":"FOCS", "publisher":"IEEE Computer Society Press", "month":"November", "cat":["Theory","Coding","Cuts and Flows"], "key":"Karger:Turbo", "year":"2002", "pdf":"Papers/LpTurboDecoding.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$35^{th}$} Annual Symposium on the Foundations of Computer Science", "psgz":"Papers/LpTurboDecoding.ps.gz", "organization":"IEEE", "origin":"http://service.simile-widgets.org/babel/preview#47156eb66d65af58d6573ffae8ae3504" }, {"id":"Bridging High-Throughput Genetic and Transcriptional Data Reveals Cellular Responses to Alpha-Synuclein Toxicity", "label":"Bridging High-Throughput Genetic and Transcriptional Data Reveals Cellular Responses to Alpha-Synuclein Toxicity", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:b2bebf5193f4c4cb4561b412f6baf76a", "modified":"no", "pages":"316-23", "date":"2009-03", "author":["Yeger-Lotem, Esther","Riva, Laura","Su, L.J.","Gitler, A.D.","Cashikar, A.G.","King, O.D.","Auluck, P.K.","Geddie, M.L.","Valastyan, J.S.","Karger, David R.","Lindquist, Susan","Fraenkel, Ernest"], "url":"http://www.ncbi.nlm.nih.gov/pubmed/19234470", "volume":"41", "month":"March", "cat":["Theory","Applications of Theory"], "journal":"Nature Genetics", "key":"Karger:InteractomeFlow", "year":"2009", "pub-type":"article", "number":"3", 
"origin":"http://service.simile-widgets.org/babel/preview#Bridging%20High-Throughput%20Genetic%20and%20Transcriptional%20Data%20Reveals%20Cellular%20Responses%20to%20Alpha-Synuclein%20Toxicity" }, {"id":"It's All the Same to Me: Data Unification in Personal Information Management", "label":"It's All the Same to Me: Data Unification in Personal Information Management", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:8b35037ddc45ef217a2c29bbaf5c7bfc", "modified":"no", "abstract":"Information fragmentation is a pervasive problem in personal information management. Even a seemingly simple decision, such as whether to say \"yes\" to a dinner invitation, often depends upon information from several sources---a calendar, a paper flyer, web sites, a previous email conversation, etc. This information is fragmented by the very tools that have been designed to help us manage it. Applications often store their data in their own particular locations and representations, inaccessible to other applications. Consider the information Alex maintains about Brooke. He must keep Brooke's address in his address book, his picture in a photo album, his home page in his web bookmarks, a birthday invitation he is editing with her in his file system, and an appointment with her in his calendar. This fragmentation causes numerous problems. There is no one \"directory\" Alex can use to find all the information about Brooke; nor any way to \"link\" pieces of information about Brooke to each other. Instead, Alex must launch multiple applications and perform numerous repetitive searches for relevant information, to say nothing of deciding which applications to look in. He may change data in one place (a new married name in the address book) and fail to change it elsewhere, leading to inconsistency that makes it even harder to find information (which name does Alex use to search the photo album?). 
While the computer has fragmented information, it can also be used to put the pieces together again. This chapter surveys some of the ways in which our personal information might be better unified.", "pages":"127-152", "date":"2007", "author":"Karger, David R.", "editor":"William Jones and Jaime Teevan", "publisher":"University of Washington Press", "cat":["Information Retrieval","Semantic Web"], "chapter":"8", "key":"Karger:PimBook", "year":"2007", "pdf":"Papers/pimchapter.pdf", "pub-type":"incollection", "booktitle":"Personal Information Management", "address":"Seattle, WA", "origin":"http://service.simile-widgets.org/babel/preview#It's%20All%20the%20Same%20to%20Me%3A%20Data%20Unification%20in%20Personal%20Information%20Management" }, {"id":"Linear-Time Poisson-Disk Patterns", "label":"Linear-Time Poisson-Disk Patterns", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:f292f544e80df1e88180f687cb5e9f44", "modified":"no", "pages":"177--182", "date":"2011-10", "author":["Jones, Thouis","Karger, David R."], "volume":"15", "month":"October", "cat":"Applications of Theory", "journal":"Journal of Graphics, GPU, and Game Tools", "key":"Karger:Disks", "year":"2011", "pub-type":"article", "number":"3", "origin":"http://service.simile-widgets.org/babel/preview#Linear-Time%20Poisson-Disk%20Patterns" }, {"id":"Haystack: A Platform for Creating, Organizing, and Visualizing Semistructured Information", "label":"Haystack: A Platform for Creating, Organizing, and Visualizing Semistructured Information", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:6f7af7b4ffcd4aa699095a439a8280d7", "modified":"no", "note":"Demo", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"Proceedings of the 2003 International Conference on Intelligent User Interfaces", "date":"2003-01", "author":["Huynh, David","Karger, David R.","Quan, Dennis"], "bibsource":"DBLP, http://dblp.uni-trier.de", "pdfkb":"97", "venue":"IUI", "publisher":"ACM", 
"month":"January", "cat":["Information Retrieval","Semantic Web","CHI","Haystack"], "key":"Karger:Haystack-IUI03-Poster", "year":"2003", "isbn":"1-58113-586-6", "pdf":"http://haystack.lcs.mit.edu/papers/iui2003-demo.pdf", "pub-type":["misc","Demo"], "confurl":"http://www.iuiconf.org/03program.html", "booktitle":"Intelligent User Interfaces", "address":"Miami, FL", "origin":"http://service.simile-widgets.org/babel/preview#Haystack%3A%20A%20Platform%20for%20Creating%2C%20Organizing%2C%20and%20Visualizing%20Semistructured%20Information" }, {"id":"The Semantic Web and End Users: What's Wrong and How to Fix It", "label":"The Semantic Web and End Users: What's Wrong and How to Fix It", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:d0b406bbb809ad5ad99ac6550842f8e6", "modified":"no", "abstract":"The Semantic Web's potential to deliver tools that help end users capture, communicate, and manage information has yet to be fulfilled, and far too little research is going into doing so. 
The author examines the poor state of current tools, argues that the Semantic Web offers a key part of the answer to building better ones, and discusses what needs to change in Semantic Web research to attain that goal.", "pages":"64-70", "date":"2014-11", "author":"Karger, David R.", "doi":"http://dx.doi.org/10.1109/MIC.2014.124", "volume":"18", "publisher":"IEEE", "date_0":"2014-11", "month":"November", "key":"Karger:SemWebPolemic", "year":"2014", "pub-type":"article", "booktitle":"{IEEE} Internet Computing", "number":"6", "origin":"http://service.simile-widgets.org/babel/preview#The%20Semantic%20Web%20and%20End%20Users%3A%20What's%20Wrong%20and%20How%20to%20Fix%20It" }, {"id":"Job Scheduling in Rings", "label":"Job Scheduling in Rings", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:1f5114909892b72a2d1ff92907dcb2d5", "modified":"no", "note":"A preliminary version appeared in Proceedings of the {$6^{th}$} Annual {ACM}-{SIAM} Symposium on Parallel Algorithms and Architectures", "abstract":"We give a distributed approximation algorithm for job scheduling in a ring architecture. In contrast to many other parallel scheduling models, the model we consider captures the influence of the underlying communications network by specifying that task migration from one processor to another takes time proportional to the distance between those two processors in the network. As a result, our algorithm must balance computational load and communication time. The algorithm is simple, requires no global control, and yields schedules of length at most 4.22 times optimal. 
We also give a lower bound on the performance of any distributed algorithm and the results of simulation experiments which suggest better performance than does our worst-case analysis.", "pages":"122--133", "date":"1997-09", "author":["Fizzano, Perry","Karger, David R.","Stein, Cliff","Wein, Joel"], "volume":"45", "month":"September", "cat":"Theory", "journal":"Journal of Parallel and Distributed Computing", "key":"Karger:Ring", "year":"1997", "pub-type":"article", "number":"2", "origin":"http://service.simile-widgets.org/babel/preview#Job%20Scheduling%20in%20Rings" }, {"id":"A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees", "label":"A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:d918c972bdcbdb9f424039a0248ab606", "modified":"no", "abstract":"We present a randomized linear-time algorithm to find a minimum spanning tree in a connected graph with edge weights. The algorithm uses random sampling in combination with a recently discovered linear-time algorithm for verifying a minimum spanning tree. 
Our computational model is a unit-cost random-access machine with the restriction that the only operations allowed on edge weights are binary comparisons.", "pages":"321--328", "date":"1995-03", "ps":"http://people.csail.mit.edu/karger/Papers/mst.ps", "author":["Karger, David R.","Klein, Philip N.","Tarjan, Robert E."], "url":"http://doi.acm.org/10.1145/201019.201022", "volume":"42", "month":"March", "cat":"Theory", "journal":"Journal of the ACM", "key":"Karger:MST-Journal", "year":"1995", "pub-type":"article", "number":"2", "origin":"http://service.simile-widgets.org/babel/preview#A%20Randomized%20Linear-Time%20Algorithm%20to%20Find%20Minimum%20Spanning%20Trees" }, {"id":"Data Unification in Personal Information Management", "label":"Data Unification in Personal Information Management", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:54d922b12e68c440d3216c2eeb7392d7", "modified":"no", "abstract":"Users need ways to unify, simplify, and consolidate information too often fragmented by location, device, and software application.", "pages":"77-82", "date":"2006-01", "author":["Karger, David R.","Jones, William"], "doi":"http://doi.acm.org/10.1145/1107458.1107496", "volume":"49", "month":"January", "cat":"Information Retrieval", "journal":"Communications of the ACM", "key":"Karger:CACM-Data", "year":"2006", "pub-type":"article", "number":"1", "origin":"http://service.simile-widgets.org/babel/preview#Data%20Unification%20in%20Personal%20Information%20Management" }, {"id":"Approximation Algorithms for Orienteering and Discounted-Reward TSP", "label":"Approximation Algorithms for Orienteering and Discounted-Reward TSP", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:16496c0049e69222654455c42a7b5fe3", "modified":"no", "note":"Journal version appears in SIAM Journal on Computing 37(2)", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"b4a292aa37a5cd78a6a97eb99c733ba4", "place":"Cambridge, MA", "abstract":"In this 
paper, we give the first constant-factor approximation algorithm for the rooted ORIENTEERING problem, as well as a new problem that we call the DISCOUNTED-REWARD-TSP, motivated by robot navigation. In both problems, we are given a graph with lengths on edges and rewards on nodes, and a start node s. In the ORIENTEERING problem, the goal is to find a path starting at s that maximizes the reward collected, subject to a hard limit on the total length of the path. In the DISCOUNTED-REWARD-TSP, instead of a length limit we are given a discount factor γ, and the goal is to maximize total discounted reward collected, where reward for a node reached at time t is discounted by γ^t. This problem is motivated by an approximation to a planning problem in the Markov decision process (MDP) framework under the commonly employed infinite horizon discounted reward optimality criterion. The approximation arises from a need to deal with exponentially large state spaces that emerge when trying to model one-time events and non-repeatable rewards (such as for package deliveries). We also consider tree and multiple-path variants of these problems and provide approximations for those as well. Although the unrooted ORIENTEERING problem, where there is no fixed start node s, has been known to be approximable using algorithms for related problems such as k-TSP (in which the amount of reward to be collected is fixed and the total length is approximately minimized), ours is the first to approximate the rooted question, solving an open problem [3, 1]. We complement our approximation result for ORIENTEERING by showing that the problem is APX-hard. 
", "date":"2003-10", "ps":"http://people.csail.mit.edu/karger/Papers/markovTSP.ps", "author":["Blum, Avrim","Chawla, Shuchi","Karger, David R.","Lane, Terran","Meyerson, Adam","Minkoff, Maria"], "venue":"FOCS", "publisher":"IEEE Computer Society Press", "month":"October", "cat":"Theory", "key":"Karger:DiscountTSP", "year":"2003", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$36^{th}$} Annual Symposium on the Foundations of Computer Science", "organization":"IEEE", "origin":"http://service.simile-widgets.org/babel/preview#Approximation%20Algorithms%20for%20Orienteering%20and%20Discounted-Reward%20TSP" }, {"id":"e532770cb7db419fb29a42b660a88854", "label":"Approximate Graph Coloring by Semidefinite Programming", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:e532770cb7db419fb29a42b660a88854", "modified":"no", "note":"Journal version appears in Journal of the ACM 45(2)", "crossref":"Proceedings of the {$35^{th}$} Annual Symposium on the Foundations of Computer Science", "place":"Santa Fe, NM", "abstract":"We consider the problem of coloring k-colorable graphs with the fewest possible colors. We present a randomized polynomial time algorithm that colors a 3-colorable graph on n vertices with min{O(Δ^{1/3} log^{1/2} Δ log n), O(n^{1/4} log^{1/2} n)} colors where Δ is the maximum degree of any vertex. Besides giving the best known approximation ratio in terms of n, this marks the first nontrivial approximation result as a function of the maximum degree Δ. This result can be generalized to k-colorable graphs to obtain a coloring using min{O(Δ^{1-2/k} log^{1/2} Δ log n), O(n^{1-3/(k+1)} log^{1/2} n)} colors. Our results are inspired by the recent work of Goemans and Williamson who used an algorithm for semidefinite optimization problems, which generalize linear programs, to obtain improved approximations for the MAX CUT and MAX 2-SAT problems. 
An intriguing outcome of our work is a duality relationship established between the value of the optimum solution to our semidefinite program and the Lovász θ-function. We show lower bounds on the gap between the optimum solution of our semidefinite program and the actual chromatic number; by duality this also demonstrates interesting new facts about the θ-function.", "pages":"2--13", "date":"1994-11", "author":["Karger, David R.","Motwani, Rajeev","Sudan, Madhu"], "editor":"Shafi Goldwasser", "venue":"FOCS", "publisher":"IEEE Computer Society Press", "month":"November", "cat":"Theory", "key":"Karger:Coloring-Conf", "year":"1994", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$35^{th}$} Annual Symposium on the Foundations of Computer Science", "organization":"IEEE", "origin":"http://service.simile-widgets.org/babel/preview#e532770cb7db419fb29a42b660a88854" }, {"id":"End-users Publishing Structured Information on the Web: An Observational Study of What, Why, and How", "label":"End-users Publishing Structured Information on the Web: An Observational Study of What, Why, and How", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:e0b57758a6efd02779ab02eeb028faa9", "modified":"no", "pages":"1265--1274", "date":"2014-03", "author":["Benson, Edward","Karger, David R."], "url":"http://doi.acm.org/10.1145/2556288.2557036", "doi":"http://dx.doi.org/10.1145/2556288.2557036", "acmid":"2557036", "venue":"CHI", "publisher":"ACM", "month":"March", "cat":["CHI","Visualization","Ethnography","Haystack"], "series":"CHI '14", "key":"Karger:ExhibitUsers", "numpages":"10", "year":"2014", "isbn":"978-1-4503-2473-1", "pdf":"http://edwardbenson.com/papers/chi2014-exhibit-study.pdf", "pub-type":"inproceedings", "keywords":["faceted browsing","information architectures","web content editing","web design"], "booktitle":"Proceedings of the 32Nd Annual ACM Conference on Human Factors in Computing Systems", "location":"Toronto, Ontario, Canada", 
"address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#End-users%20Publishing%20Structured%20Information%20on%20the%20Web%3A%20An%20Observational%20Study%20of%20What%2C%20Why%2C%20and%20How" }, {"id":"The Pathetic Fallacy of RDF", "label":"The Pathetic Fallacy of RDF", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:e0907c24914dff6af40a6be66896ba47", "modified":"no", "note":"collocated with ISWC 2006", "date":"2006-11", "author":["Karger, David R.","schraefel, m. c."], "url":"http://swui.semanticweb.org/swui06/papers/Karger/Pathetic_Fallacy.html", "venue":"SWUI", "month":"November", "cat":["CHI","Information Retrieval","Semantic Web"], "key":"Karger:Pathetic", "year":"2006", "pub-type":"inproceedings", "booktitle":" SWUI 2006 - 3rd International Semantic Web User Interaction Workshop", "location":"Athens, Georgia, USA", "origin":"http://service.simile-widgets.org/babel/preview#The%20Pathetic%20Fallacy%20of%20RDF" }, {"id":"Haystack: A Platform for Authoring End-User Semantic Web Applications", "label":"Haystack: A Platform for Authoring End-User Semantic Web Applications", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:eca7f80e58ffc140a5428e4ab832f8bb", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"$2^{nd}$ International Semantic Web Conference", "link":"http://www.springerlink.com/index/H74TVQB63J2DF9W8", "abstract":"The Semantic Web promises to open innumerable opportunities for automation and information retrieval by standardizing the protocols for metadata exchange. However, just as the success of the World Wide Web can be attributed to the ease of use and ubiquity of Web browsers, we believe that the unfolding of the Semantic Web vision depends on users getting powerful but easy-to-use tools for managing their information. 
But unlike HTML, which can be easily edited in any text editor, RDF is more complicated to author and does not have an obvious presentation mechanism. Previous work has concentrated on the ideas of generic RDF graph visualization and RDF Schema-based form generation. In this paper, we present a comprehensive platform for constructing end user applications that create, manipulate, and visualize arbitrary RDF-encoded information, adding another layer to the abstraction cake. We discuss a programming environment specifically designed for manipulating RDF and introduce user interface concepts on top that allow the developer to quickly assemble applications that are based on RDF data models. Also, because user interface specifications and program logic are themselves describable in RDF, applications built upon our framework enjoy properties such as network updatability, extensibility, and end user customizability---all desirable characteristics in the spirit of the Semantic Web. ", "pages":"738--753", "date":"2003-10", "author":["Quan, Dennis","Huynh, David","Karger, David R."], "url":"http://haystack.csail.mit.edu/papers/iswc2003-haystack", "pdfkb":"153", "venue":"ISWC", "month":"October", "cat":["Information Retrieval","Haystack","Semantic Web"], "key":["Karger:SemApps","$2^{nd}$ International Semantic Web Conference"], "year":"2003", "pdf":"http://haystack.lcs.mit.edu/papers/iswc2003-haystack.pdf", "pub-type":"inproceedings", "confurl":"http://iswc2003.semanticweb.org/", "booktitle":"$2^{nd}$ International Semantic Web Conference", "address":"Sanibel Island, FL", "origin":"http://service.simile-widgets.org/babel/preview#Haystack%3A%20A%20Platform%20for%20Authoring%20End-User%20Semantic%20Web%20Applications" }, {"id":"Empirical Development of an Exponential Probabilistic Model for Text Retrieval: Using Textual Analysis to Build a Better Model", "label":"Empirical Development of an Exponential Probabilistic Model for Text Retrieval: Using Textual Analysis to Build a 
Better Model", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:02ed1b9610c6f45f03bc010ae88df050", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"$26^{th}$ Internationl ACM SIGIR Conference", "abstract":"Much work in information retrieval focuses on using a model of documents and queries to derive retrieval algorithms. Model based development is a useful alternative to heuristic development because in a model the assumptions are explicit and can be examined and refined independent of the particular retrieval algorithm. We explore the explicit assumptions underlying the naive framework by performing computational analysis of actual corpora and queries to devise a generative document model that closely matches text. Our thesis is that a model so developed will be more accurate than existing models, and thus more useful in retrieval, as well as other applications. We test this by learning from a corpus the best document model. We find the learned model better predicts the existence of text data and has improved performance on certain IR tasks.", "date":"2003-07", "ps":"http://haystack.csail.mit.edu/documents/papers/2003/teevan.sigir03.ps", "author":["Teevan, Jaime","Karger, David R."], "venue":"SIGIR", "publisher":"ACM", "month":"July", "cat":["Information Retrieval","Machine Learning"], "key":"Karger:BayesIR", "year":"2003", "pdf":"http://haystack.csail.mit.edu/documents/papers/2003/teevan.sigir03.pdf", "pub-type":"inproceedings", "confurl":"http://www.sigir2003.org", "booktitle":"$26^{th}$ Internationl ACM SIGIR Conference", "organization":"ACM SIGIR", "address":"Toronto", "origin":"http://service.simile-widgets.org/babel/preview#Empirical%20Development%20of%20an%20Exponential%20Probabilistic%20Model%20for%20Text%20Retrieval%3A%20Using%20Textual%20Analysis%20to%20Build%20a%20Better%20Model" }, {"id":"Standards opportunities around data-bearing Web pages", "label":"Standards opportunities around data-bearing Web 
pages", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:9b92425a50be73bbc765cfa197c96c72", "modified":"no", "pages":"20120381", "date":"2013-03", "author":"Karger, David R.", "doi":"http://doi.acm.org/10.1098/rsta.2012.0381", "volume":"371", "publisher":"The Royal Society", "month":"March", "cat":["CHI","Semantic Web"], "journal":"Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences", "key":"Karger:VizStandards", "year":"2013", "pdf":"Papers/standards.pdf", "pub-type":"article", "number":"1987", "origin":"http://service.simile-widgets.org/babel/preview#Standards%20opportunities%20around%20data-bearing%20Web%20pages" }, {"id":"User Interfaces for Supporting Multiple Categorization.", "label":"User Interfaces for Supporting Multiple Categorization.", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:c351b82ba3b348f71041ddd1d794ac68", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"INTERACT: $9^{th}$ IFIP International Conference on Human Computer Interaction", "abstract":"As the amount of information stored on and accessed by computer has increased over the past twenty years, the tools available for organizing and retrieving such information have become outdated. The folder paradigm has dominated existing user interfaces as the primary mechanism for organizing information for day-to-day use. This paradigm encourages many-to-one placement of documents into strictly hierarchical containers. In this paper we examine an alternative organization and navigation mechanism that promotes membership in multiple overlapping categories (as opposed to storage containment). In particular, we explore the user interface consequences of multiple categorization support being made conveniently available from within Web browsers. 
We have carried out user studies providing evidence that compared to the folder paradigm, multiple categorization not only improves organization and retrieval times but also matches more closely with the way users naturally think about organizing their information.", "pages":"228--235", "date":"2003-09", "author":["Quan, Dennis","Bakshi, Karun","Huynh, David","Karger, David R."], "pdfkb":"175", "venue":"INTERACT", "month":"September", "cat":["Information Retrieval","Semantic Web","CHI"], "key":"Karger:MultipleCategorization", "year":"2003", "pdf":"http://haystack.csail.mit.edu/documents/papers/2003/interact2003-multicat.pdf", "pub-type":"inproceedings", "confurl":"http://www.interact2003.org/", "booktitle":"INTERACT: $9^{th}$ IFIP International Conference on Human Computer Interaction", "organization":"International Federation for Information Processing", "address":"Zurich", "origin":"http://service.simile-widgets.org/babel/preview#User%20Interfaces%20for%20Supporting%20Multiple%20Categorization." 
}, {"id":"Efficient crowdsourcing for multi-class labeling", "label":"Efficient crowdsourcing for multi-class labeling", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:30697fd5611c563bd86de7ffecf3232e", "modified":"no", "pages":"81--92", "date":"2013-06", "author":["Karger, David R.","Oh, Sewoong","Shah, Devavrat"], "url":"http://doi.acm.org/10.1145/2465529.2465761", "doi":"10.1145/2465529.2465761", "acmid":"2465761", "publisher":"ACM", "date_0":"2013-06", "month":"June", "cat":["Theory","Applications of Theory","Machine Learning","Crowdsourcing"], "series":"SIGMETRICS '13", "key":"Karger:MultiClassCrowd", "numpages":"12", "year":"2013", "isbn":"978-1-4503-1900-3", "pub-type":"inproceedings", "keywords":["crowdsourcing","human computation","low-rank matrix","random graphs"], "booktitle":"Proceedings of the ACM SIGMETRICS/international conference on Measurement and modeling of computer systems", "location":"Pittsburgh, PA, USA", "organization":"ACM", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Efficient%20crowdsourcing%20for%20multi-class%20labeling" }, {"id":"Magnet: supporting navigation in semistructured data environments", "label":"Magnet: supporting navigation in semistructured data environments", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:82d9d67cd273eb7a02e02f456e4defda", "modified":"no", "abstract":"With the growing importance of systems containing arbitrary semistructured relationships, the need for supporting users searching in such repositories has grown. Currently support for users' search needs either has required domain-specific user interfaces or has required users to be schema experts. We have developed a general-purpose tool that offers users helpful navigation and refinement options for seeking information in these semistructured repositories. 
We show how a tool can be built without requiring domain-specific assumptions about the information being explored. In addition to describing a general approach to the problem, we provide a set of natural, general-purpose refinement tactics, many generalized from past work on textual information retrieval.", "pages":"97--106", "date":"2005-06", "author":["Sinha, Vineet","Karger, David R."], "doi":"http://doi.acm.org/10.1145/1066157.1066169", "venue":"SIGMOD", "publisher":"ACM Press", "month":"June", "cat":["Semantic Web","Information Retrieval"], "key":"Karger:Magnet", "year":"2005", "isbn":"1-59593-060-4", "pdf":"http://haystack.lcs.mit.edu/papers/magnet-sigmod2005.pdf", "pub-type":"inproceedings", "booktitle":"SIGMOD '05: Proceedings of the 2005 ACM SIGMOD international conference on Management of data", "location":"Baltimore, Maryland", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Magnet%3A%20supporting%20navigation%20in%20semistructured%20data%20environments" }, {"id":"Tackling the Poor Assumptions of Naive Bayes Text Classifiers", "label":"Tackling the Poor Assumptions of Naive Bayes Text Classifiers", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:1838a7bd617efaa0a36c7dc634e67c39", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"The Twentieth International Conference on Machine Learning", "abstract":"Naive Bayes is often used as a baseline in text classification because it is fast and easy to implement. Its severe assumptions make such efficiency possible but also adversely affect the quality of its results. In this paper we propose simple, heuristic solutions to some of the problems with Naive Bayes classifiers, addressing both systemic issues as well as problems that arise because text is not actually generated according to a multinomial model. 
We find that our simple corrections result in a fast algorithm that is competitive with state-of-the-art text classification algorithms such as the Support Vector Machine.", "date":"2003-08", "author":["Rennie, Jason D. M.","Shih, Lawrence","Teevan, Jaime","Karger, David R."], "url":"http://www.hpl.hp.com/conferences/icml03/", "venue":"ICML", "month":"August", "cat":["Information Retrieval","Machine Learning"], "key":"Karger:FixBayes", "year":"2003", "pdf":"http://haystack.csail.mit.edu/documents/papers/2003/rennie.icml03.pdf", "pub-type":"inproceedings", "booktitle":"The Twentieth International Conference on Machine Learning", "psgz":"http://haystack.csail.mit.edu/documents/papers/2003/rennie.icml03.ps.gz", "address":"Washington, DC", "origin":"http://service.simile-widgets.org/babel/preview#Tackling%20the%20Poor%20Assumptions%20of%20Naive%20Bayes%20Text%20Classifiers" }, {"id":"Deterministic Network Coding by Matrix Completion", "label":"Deterministic Network Coding by Matrix Completion", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:57a4973f2ab26e6d8b65f1be1023486e", "modified":"no", "crossref":"Proceedings of the {$16^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "place":"Vancouver, BC", "date":"2005-01", "ps":"Papers/detnetcode.ps", "author":["Harvey, Nicholas J. A.","Karger, David R.","Murota, Kazuo"], "abstsract":" We present a new deterministic algorithm to construct network codes for multicast problems, a particular class of network information flow problems. Our algorithm easily generalizes to several variants of multicast problems. Our approach is based on a new algorithm for maximum-rank completion of mixed matrices---taking a matrix whose entries are a mixture of numeric values and symbolic variables, and assigning values to the variables so as to maximize the resulting matrix rank. Our algorithm is faster than existing deterministic algorithms and can operate over a smaller field. 
", "venue":"SODA", "month":"January", "cat":["Theory","Coding"], "key":"Karger:DetNetCoding", "year":"2005", "pdf":"Papers/detnetcode.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$16^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "organization":"ACM-SIAM", "origin":"http://service.simile-widgets.org/babel/preview#Deterministic%20Network%20Coding%20by%20Matrix%20Completion" }, {"id":"Koorde: A Simple Degree-Optimal Distributed Hash Table", "label":"Koorde: A Simple Degree-Optimal Distributed Hash Table", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:0d690b114eec1cd9b8e3f8891a2ceed5", "modified":"no", "crossref":"2nd International Workshop on Peer to Peer Systems", "link":"http://www.springerlink.com/index/UNMQCQY0YXPU32XP", "abstract":"Koorde is a new distributed hash table (DHT) based on Chord [15] and the de Bruijn graphs [2]. While inheriting the simplicity of Chord, Koorde meets various lower bounds, such as O(log n) hops per lookup request with only 2 neighbors per node (where n is the number of nodes in the DHT), and O(log n/ log log n) hops per lookup request with O(log n) neighbors per node.", "date":"2003-01", "ps":"http://iptps03.cs.berkeley.edu/finalpapers/koorde.ps", "author":["Kaashoek, M. Frans","Karger, David R."], "editor":"M. 
Frans Kaashoek and Ion Stoica", "venue":"IPTPS", "publisher":"Springer", "month":"January", "cat":["Theory","Systems","P2P","Applications of Theory"], "series":"LNCS Hot Topics", "key":"Karger:Koorde-IPTPS", "year":"2003", "pub-type":"inproceedings", "confurl":"http://iptps03.cs.berkeley.edu/", "booktitle":"2nd International Workshop on Peer to Peer Systems", "address":"Berkeley, CA", "origin":"http://service.simile-widgets.org/babel/preview#Koorde%3A%20A%20Simple%20Degree-Optimal%20Distributed%20Hash%20Table" }, {"id":"Budget-optimal task allocation for reliable crowdsourcing systems", "label":"Budget-optimal task allocation for reliable crowdsourcing systems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:6399a8658fb4031f01bb3ba37dd4dec2", "modified":"no", "pages":"1--24", "date":"2014-02", "author":["Karger, David R.","Oh, Sewoong","Shah, Devavrat"], "volume":"62", "publisher":"INFORMS", "date_0":"2014-02", "month":"February", "cat":["Applications of Theory","Theory","Crowdsourcing","Machine Learning"], "journal":"Operations Research", "key":"Karger:karger2014budget", "year":"2014", "pdf":"http://arxiv.org/abs/1110.3564", "pub-type":"article", "number":"1", "origin":"http://service.simile-widgets.org/babel/preview#Budget-optimal%20task%20allocation%20for%20reliable%20crowdsourcing%20systems" }, {"id":"24161f254ca20ee674d034c98e4b0683", "label":"Derandomization Through Approximation: An {${\\cal NC}$} Algorithm for Minimum Cuts", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:24161f254ca20ee674d034c98e4b0683", "modified":"no", "note":"Journal version appears in SIAM Journal on Computing 26(1)", "crossref":"Proceedings of the {$25^{th}$} {ACM} Symposium on Theory of Computing", "place":"San Diego, CA", "abstract":"We show that the minimum cut and multi-cut problems in weighted undirected graphs can be solved in NC. We do so by giving three separate and independently interesting results. 
The first is an $m^2/n$ processor NC algorithm for a (2 + $\\epsilon$)-approximation to the minimum cut. The second is a randomized reduction of the minimum cut problem to the problem of obtaining a (2 + $\\epsilon$)-approximation to the minimum cut. This reduction involves a natural combinatorial Safe Sets Problem that can be solved easily in RNC. Our third result is a derandomization of this RNC solution that requires a novel combination of two widely used tools: pairwise independence and random walks on expanders. We believe that the safe sets approach will prove useful in other derandomization problems.", "pages":"497--506", "date":"1993-05", "ps":"http://people.csail.mit.edu/karger/Papers/detcut.ps", "author":["Karger, David R.","Motwani, Rajeev"], "editor":"Alok Aggarwal", "venue":"STOC", "publisher":"ACM Press", "month":"May", "cat":["Theory","Cuts and Flows"], "key":"Karger:DetCut-Conf", "year":"1993", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$25^{th}$} {ACM} Symposium on Theory of Computing", "organization":"ACM", "origin":"http://service.simile-widgets.org/babel/preview#24161f254ca20ee674d034c98e4b0683" }, {"id":"Haystack: Per-User Information Environments", "label":"Haystack: Per-User Information Environments", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:f49ab85492efaa2d8784180f76de24b9", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "abstract":" Traditional Information Retrieval (IR) systems are designed to provide uniform access to centralized corpora by large numbers of people. The Haystack project emphasizes the relationship between a particular individual and his corpus. An individual's own haystack privileges information with which that user interacts, gathers data about those interactions, and uses this metadata to further personalize the retrieval process. This paper describes the prototype Haystack system. 
", "pages":"413--422", "date":"1999-11", "ps":"Papers/cikm99.ps", "author":["Adar, Eytan","Karger, David R.","Stein, Lynn Andrea"], "venue":"CIKM", "month":"November", "cat":["Information Retrieval","Semantic Web","CHI","Haystack"], "key":"Karger:Haystack-CIKM", "year":"1999", "pdf":"Papers/cikm99.pdf", "pub-type":"inproceedings", "confurl":"http://portal.acm.org/toc.cfm?id=319950&dl=ACM&type=proceeding&idx=SERIES772&part=Proceedings&WantType=Proceedings", "booktitle":"Proceedings of the 8th International Conference on Information and Knowledge Management", "psgz":"Papers/cikm99.ps.gz", "origin":"http://service.simile-widgets.org/babel/preview#Haystack%3A%20Per-User%20Information%20Environments" }, {"id":"Arpeggio: Metadata Searching and Content Sharing with Chord", "label":"Arpeggio: Metadata Searching and Content Sharing with Chord", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:08e0500e3a9adda7c414bd7cfd3d3229", "modified":"no", "crossref":"4th International Workshop on Peer to Peer Systems", "abstract":"Arpeggio is a peer-to-peer file-sharing network based on the Chord lookup primitive. Queries for data whose metadata matches a certain criterion are performed efficiently by using a distributed keyword-set index, augmented with index-side filtering. We introduce index gateways, a technique for minimizing index maintenance overhead. Because file data is large, Arpeggio employs subrings to track live source peers without the cost of inserting the data itself into the network. Finally, we introduce postfetching, a technique that uses information in the index to improve the availability of rare files. The result is a system that provides efficient query operations with the scalability and reliability advantages of full decentralization, and a content distribution system tuned to the requirements and capabilities of a peer-to-peer network.", "pages":"58--68", "date":"2005-02", "author":["Clements, Austin T.","Ports, Dan R. 
K.","Karger, David R."], "doi":"http://dx.doi.org/10.1007/11558989_6", "editor":"M. Frans Kaashoek and Ion Stoica", "venue":"IPTPS", "publisher":"Springer", "month":"February", "cat":["Systems","P2P"], "series":"LNCS Hot Topics", "key":"Karger:Arpeggio", "year":"2005", "pdf":"http://project-iris.net/irisbib/papers/arpeggio:iptps05/paper.pdf", "pub-type":"inproceedings", "confurl":"http://iptps05.cs.cornell.edu/", "booktitle":"4th International Workshop on Peer to Peer Systems", "address":"Ithaca, NY", "origin":"http://service.simile-widgets.org/babel/preview#Arpeggio%3A%20Metadata%20Searching%20and%20Content%20Sharing%20with%20Chord" }, {"id":"Techniques for Scheduling with Rejection", "label":"Techniques for Scheduling with Rejection", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:cb1238b56f81cd103df89f55eeb447bb", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "coden":"LNCSD9", "abstract":"We consider the general problem of scheduling a set of jobs where we may choose not to schedule certain jobs, and thereby incur a penalty for each rejected job. More specifically, we focus on choosing a set of jobs to reject and constructing a schedule for the remaining jobs so as to optimize the sum of the weighted completion times of the jobs scheduled plus the sum of the penalties of the jobs rejected. We give several techniques for designing scheduling algorithms under this criterion. Many of these techniques show how to reduce a problem with rejection to a (potentially more complex) scheduling problem without rejection. Some of the reductions are based on general properties of certain kinds of linear-programming relaxations of optimization problems, and therefore are applicable to problems outside of scheduling; we demonstrate this by giving an approximation algorithm for a variant of the facility-location problem. 
In the last section of the paper we consider a different notion of rejection in the context of scheduling: scheduling jobs with due dates so as to maximize the number of jobs that complete by their due dates, or equivalently to minimize the number of jobs that do not complete by their due date and that thus can be considered \"rejected.\" We investigate the approximability of a simple version of this problem, giving approximation algorithms and characterizing integrality gaps of a class of linear-programming relaxations. ", "pages":"490", "date":"1998-08", "ps":"Papers/rejection.ps", "author":["Engels, Daniel W.","Karger, David R.","Kolliopoulos, S. G.","Sengupta, S.","Uma, R. N.","Wein, Joel"], "volume":"1461", "venue":"ESA", "month":"August", "cat":"Theory", "key":"Karger:Rejection", "year":"1998", "pub-type":"inproceedings", "booktitle":"European Symposium on Algorithms (Lecture Notes in Computer Science) ", "issn":"0302-9743", "origin":"http://service.simile-widgets.org/babel/preview#Techniques%20for%20Scheduling%20with%20Rejection" }, {"id":"Finding the Hidden Path: Time Bounds for All-Pairs Shortest Paths", "label":"Finding the Hidden Path: Time Bounds for All-Pairs Shortest Paths", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:deff8c2193e4a2d496672563bfb0171e", "modified":"no", "note":"Journal version appears in SIAM Journal on Computing 22(6)", "crossref":"Proceedings of the {$32^{nd}$} Annual Symposium on the Foundations of Computer Science", "place":"San Juan, Puerto Rico", "abstract":"The all-pairs shortest paths problem in weighted graphs is investigated. An algorithm called the hidden paths algorithm, which finds these paths in time O(m*n + n^2 log n), where m* is the number of edges participating in shortest paths, is presented. It is argued that m* is likely to be small in practice, since m*=O(n log n) with high probability for many probability distributions on edge weights. 
An O(mn) lower bound on the running time of any path-comparison-based algorithm for the all-pairs shortest paths problem is proved", "pages":"560--568", "date":"1991-10", "ps":"http://people.csail.mit.edu/karger/Papers/path.focs.ps", "author":["Karger, David R.","Koller, Daphne","Phillips, Steven J."], "venue":"FOCS", "publisher":"IEEE Computer Society Press", "month":"October", "cat":"Theory", "key":"Karger:Paths-Conf", "year":"1991", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$32^{nd}$} Annual Symposium on the Foundations of Computer Science", "organization":"IEEE", "origin":"http://service.simile-widgets.org/babel/preview#Finding%20the%20Hidden%20Path%3A%20Time%20Bounds%20for%20All-Pairs%20Shortest%20Paths" }, {"id":"Learning Markov Networks: Maximum Bounded Tree-width Graphs", "label":"Learning Markov Networks: Maximum Bounded Tree-width Graphs", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:e85c4167c4992be14ffc8e00451cd28f", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"Proceedings of the {$12^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "place":"Washington, DC", "abstract":"Markov networks are a common class of graphical models used in machine learning. Such models use an undirected graph to capture dependency information among random variables in a joint probability distribution. Once one has chosen to use a Markov network model, one aims to choose the model that ``best explains'' the data that has been observed---this model can then be used to make predictions about future data. We show that the problem of learning a maximum likelihood Markov network given certain observed data can be reduced to the problem of identifying a maximum weight low-treewidth graph under a given input weight function. We give the first constant factor approximation algorithm for this problem. 
More precisely, for any fixed treewidth objective k, we find a treewidth-k graph with an f(k) fraction of the maximum possible weight of any treewidth-k graph.", "pages":"392--401", "date":"2001-01", "author":["Karger, David R.","Srebro, Nati"], "postscript":"Papers/bayes.ps", "editor":"S. Rao Kosaraju", "venue":"SODA", "month":"January", "cat":["Theory","Machine Learning"], "key":"Karger:Markov", "year":"2001", "pub-type":"inproceedings", "confurl":"http://portal.acm.org/toc.cfm?id=365411&dl=GUIDE&dl=ACM&type=proceeding&idx=SERIES422&part=Proceedings&WantType=Proceedings", "booktitle":"Proceedings of the {$12^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "psgz":"Papers/bayes.ps.gz", "organization":"ACM-SIAM", "origin":"http://service.simile-widgets.org/babel/preview#Learning%20Markov%20Networks%3A%20Maximum%20Bounded%20Tree-width%20Graphs" }, {"id":"Scheduling Algorithms", "label":"Scheduling Algorithms", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:52113118578f3856c6df492a9ed464db", "modified":"no", "date":"1998", "author":["Karger, David R.","Stein, Clifford","Wein, Joel"], "editor":"Mikhail J. Atallah", "publisher":"CRC Press", "cat":"Theory", "key":"Karger:Scheduling", "year":"1998", "isbn":"0849326494", "pub-type":"incollection", "booktitle":"Algorithms and Theory of Computation Handbook", "origin":"http://service.simile-widgets.org/babel/preview#Scheduling%20Algorithms" }, {"id":"Simple efficient load balancing algorithms for peer-to-peer systems", "label":"Simple efficient load balancing algorithms for peer-to-peer systems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:1c4fff105f37ca5b66aa50a464bd5ea6", "modified":"no", "note":"Preliminary version in IPTPS 2004", "abstract":"Load balancing is a critical issue for the efficient operation of peer-to-peer networks. We give two new load-balancing protocols whose provable performance guarantees are within a constant factor of optimal. 
Our protocols refine the consistent hashing data structure that underlies the Chord (and Koorde) P2P network. Both preserve Chord's logarithmic query time and near-optimal data migration cost. Consistent hashing is an instance of the distributed hash table (DHT) paradigm for assigning items to nodes in a peer-to-peer system: items and nodes are mapped to a common address space, and nodes have to store all items residing closeby in the address space. Our first protocol balances the distribution of the key address space to nodes, which yields a load-balanced system when the DHT maps items \"randomly\" into the address space. To our knowledge, this yields the first P2P scheme simultaneously achieving O(log n) degree, O(log n) look-up cost, and constant-factor load balance (previous schemes settled for any two of the three). Our second protocol aims to directly balance the distribution of items among the nodes. This is useful when the distribution of items in the address space cannot be randomized. 
We give a simple protocol that balances load by moving nodes to arbitrary locations \"where they are needed.\" As an application, we use the last protocol to give an optimal implementation of a distributed data structure for range searches on ordered data.", "pages":"36--43", "date":"2004-06", "ps":"Papers/loadbalancing-spaa.ps", "author":["Karger, David R.","Ruhl, Matthias"], "doi":"http://doi.acm.org/10.1145/1007912.1007919", "venue":"SPAA", "publisher":"ACM Press", "month":"June", "cat":["Theory","P2P"], "key":"Karger:P2PLoadBalance", "year":"2004", "isbn":"1-58113-840-7", "pdf":"Papers/loadbalancing-spaa.pdf", "pub-type":"inproceedings", "booktitle":"SPAA '04: Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures", "location":"Barcelona, Spain", "origin":"http://service.simile-widgets.org/babel/preview#Simple%20efficient%20load%20balancing%20algorithms%20for%20peer-to-peer%20systems" }, {"id":"Consistent Hashing and Random Trees: Distributed Caching protocols for Relieving Hot Spots on the World Wide Web", "label":"Consistent Hashing and Random Trees: Distributed Caching protocols for Relieving Hot Spots on the World Wide Web", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:e9d540d2f1bb8289b5829545f8a79c73", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"98903371af780e968ea00c3bebd9ddd0", "place":"El Paso, TX", "abstract":"We describe a family of caching protocols for distributed networks that can be used to decrease or eliminate the occurrence of hot spots in the network. Our protocols are particularly designed for use with very large networks such as the Internet, where delays caused by hot spots can be severe, and where it is not feasible for every server to have complete information about the current state of the entire network. The protocols are easy to implement using existing network protocols such as TCP/IP, and require very little overhead. 
The protocols work with local control, make efficient use of existing resources, and scale gracefully as the network grows. Our caching protocols are based on a special kind of hashing that we call consistent hashing. Roughly speaking, a consistent hash function is one which changes minimally as the range of the function changes. Through the development of good consistent hash functions, we are able to develop caching protocols which do not require users to have a current or even consistent view of the network. We believe that consistent hash functions may eventually prove to be useful in other applications such as distributed name servers and/or quorum systems. ", "pages":"654--663", "date":"1997-05", "ps":"http://people.csail.mit.edu/karger/Papers/web.ps", "author":["Karger, David R.","Lehman, Eric","Leighton, Tom","Levine, Matthew","Lewin, Daniel","Panigrahy, Rina"], "venue":"STOC", "publisher":"ACM Press", "month":"May", "cat":["Theory","Applications of Theory"], "key":"Karger:Web", "year":"1997", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$29^{th}$} {ACM} Symposium on Theory of Computing", "organization":"ACM", "origin":"http://service.simile-widgets.org/babel/preview#Consistent%20Hashing%20and%20Random%20Trees%3A%20Distributed%20Caching%20protocols%20for%20Relieving%20Hot%20Spots%20on%20the%20World%20Wide%20Web" }, {"id":"Analysis of the Evolution of Peer to Peer Systems", "label":"Analysis of the Evolution of Peer to Peer Systems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:a230c87a6eced63d1b12a096c0f3031b", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "abstract":"In this paper, we give a theoretical analysis of peer-to-peer (P2P) networks operating in the face of concurrent joins and unexpected departures. 
We focus on Chord, a recently developed P2P system that implements a distributed hash table abstraction, and study the process by which Chord maintains its distributed state as nodes join and leave the system. We argue that traditional performance measures based on run-time are uninformative for a continually running P2P network, and that the rate at which nodes in the network need to participate to maintain system state is a more useful metric. We give a general lower bound on this rate for a network to remain connected, and prove that an appropriately modified version of Chord's maintenance rate is within a logarithmic factor of the optimum rate.", "pages":"233--242", "date":"2002-07", "ps":"Papers/podc2002.ps", "author":["Liben-Nowell, David","Balakrishnan, Hari","Karger, David R."], "venue":"PODC", "month":"July", "cat":["Theory","Applications of Theory","P2P"], "key":"Karger:P2P-Evol", "year":"2002", "pdf":"Papers/podc2002.pdf", "pub-type":"inproceedings", "booktitle":"ACM Symposium on Principles of Distributed Computing", "address":"Monterey, CA", "origin":"http://service.simile-widgets.org/babel/preview#Analysis%20of%20the%20Evolution%20of%20Peer%20to%20Peer%20Systems" }, {"id":"Using Random Sampling to Find Maximum Flows in Uncapacitated Undirected Graphs", "label":"Using Random Sampling to Find Maximum Flows in Uncapacitated Undirected Graphs", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:8a95eb12c54abe40a32bb7bf67b6f8a2", "modified":"no", "crossref":"98903371af780e968ea00c3bebd9ddd0", "place":"El Paso, TX", "abstract":"We present new algorithms, based on random sampling, that find maximum flows in undirected uncapacitated graphs. Our algorithms dominate augmenting paths over all parameter values (number of vertices and edges and flow value). They also dominate blocking flows over a large range of parameter values. 
Furthermore, they achieve time bounds on graphs with parallel (equivalently, capacitated) edges that previously could only be achieved on graphs without them. The key contribution of this paper is to demonstrate that such an improvement is possible. This shows that augmenting paths and blocking flows are non-optimal, and reopens the question of how fast we can find a maximum flow. We improve known time bounds by only a small (but polynomial) factor, and the complicated nature of our algorithms suggests they will not be practical. A new idea of our algorithm is to find flow by diminishing cuts instead of augmenting paths. Rather than finding a way to push flow from the source to the sink, we identify and delete edges that are not needed in a maximum flow. When no more edges can be deleted, we know that every remaining edge must be saturated to give a maximum flow. ", "pages":"240--249", "date":"1997-05", "ps":"http://people.csail.mit.edu/karger/Papers/flow.ps", "author":"Karger, David R.", "venue":"STOC", "publisher":"ACM Press", "month":"May", "cat":["Theory","Cuts and Flows"], "key":"Karger:Diminish", "year":"1997", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$29^{th}$} {ACM} Symposium on Theory of Computing", "organization":"ACM", "origin":"http://service.simile-widgets.org/babel/preview#Using%20Random%20Sampling%20to%20Find%20Maximum%20Flows%20in%20Uncapacitated%20Undirected%20Graphs" }, {"id":"Information Scraps: How and Why Information Eludes our Personal Information Management Tools", "label":"Information Scraps: How and Why Information Eludes our Personal Information Management Tools", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:829383d8ae4e9582e3e0a37b5a1468c5", "modified":"no", "note":"special issue on PIM", "abstract":"In this article we investigate information scraps---personal information where content has been scribbled on Post-it notes, scrawled on the corners of sheets of paper, stuck in our pockets, sent 
in email messages to ourselves, and stashed in miscellaneous digital text files. Information scraps encode information ranging from ideas and sketches to notes, reminders, shipment tracking numbers, driving directions, and even poetry. Although information scraps are ubiquitous, we have much still to learn about these loose forms of information practice. Why do we keep information scraps outside of our traditional PIM applications? What role do information scraps play in our overall information practice? How might PIM applications be better designed to accommodate and support information scraps' creation, manipulation and retrieval? We pursued these questions by studying the information scrap practices of 27 knowledge workers at five organizations. Our observations shed light on information scraps' content, form, media, and location. From this data, we elaborate on the typical information scrap lifecycle, and identify common roles that information scraps play: temporary storage, archiving, work-in-progress, reminding, and management of unusual data. 
These roles suggest a set of unmet design needs in current PIM tools: lightweight entry, unconstrained content, flexible use and adaptability, visibility, and mobility.", "pages":"1--46", "date":"2008", "author":["Bernstein, Michael","Van Kleek, Max","Karger, David R.","schraefel, mc"], "doi":"http://doi.acm.org/10.1145/1402256.1402263", "volume":"26", "publisher":"ACM", "cat":["CHI","Information Retrieval","Haystack","Ethnography"], "journal":"ACM Transactions on Information Systems", "key":"Karger:Scraps", "year":"2008", "pub-type":"article", "issn":"1046-8188", "number":"4", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Information%20Scraps%3A%20How%20and%20Why%20Information%20Eludes%20our%20Personal%20Information%20Management%20Tools" }, {"id":"A Randomized Fully Polynomial Approximation Scheme for the All Terminal Network Reliability Problem", "label":"A Randomized Fully Polynomial Approximation Scheme for the All Terminal Network Reliability Problem", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:ce0cbac6bcd115b51c84f91faece544f", "modified":"no", "note":"A preliminary version appeared in Proceedings of the {$27^{th}$} {ACM} Symposium on Theory of Computing. A corrected version was published in SIAM Review 43(3)", "abstract":"The classic all-terminal network reliability problem posits a graph, each of whose edges fails independently with some given probability. The goal is to determine the probability that the network becomes disconnected due to edge failures. This problem has obvious applications in the design of communication networks. Since the problem is #P-complete and thus believed hard to solve exactly, a great deal of research has been devoted to estimating the failure probability. 
In this paper, we give a fully polynomial randomized approximation scheme that, given any n-vertex graph with specified failure probabilities, computes in time polynomial in n and $1/\\epsilon$ an estimate for the failure probability that is accurate to within a relative error of $1\\pm\\epsilon$ with high probability. We also give a deterministic polynomial approximation scheme for the case of small failure probabilities. Some extensions to evaluating probabilities of k-connectivity, strong connectivity in directed Eulerian graphs and r-way disconnection, and to evaluating the Tutte polynomial are also described.", "pages":"492--514", "date":"1999", "ps":"http://people.csail.mit.edu/karger/Papers/reliability-journal.ps", "author":"Karger, David R.", "volume":"29", "cat":["Theory","Cuts and Flows"], "journal":"SIAM Journal on Computing", "key":"Karger:Reliability", "year":"1999", "brag":"Winner, SIAM Outstanding Paper Prize, 2000", "pub-type":"article", "number":"2", "origin":"http://service.simile-widgets.org/babel/preview#A%20Randomized%20Fully%20Polynomial%20Approximation%20Scheme%20for%20the%20All%20Terminal%20Network%20Reliability%20Problem" }, {"id":"Efficient Algorithms for Fixed-Precision Instances of Bin Packing and Euclidean TSP ", "label":"Efficient Algorithms for Fixed-Precision Instances of Bin Packing and Euclidean TSP ", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:c605f1830e31ab83890978c62733eb19", "modified":"no", "pages":"104", "date":"2008-08", "author":["Karger, David R.","Scott, Jacob"], "month":"August", "cat":"Theory", "key":"Karger:FixedPrecisionBinPacking", "year":"2008", "pub-type":"inproceedings", "booktitle":"11th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems (RANDOM/APPROX)", "origin":"http://service.simile-widgets.org/babel/preview#Efficient%20Algorithms%20for%20Fixed-Precision%20Instances%20of%20Bin%20Packing%20and%20Euclidean%20TSP%20" }, {"id":"OverCite: A 
Cooperative Digital Research Library", "label":"OverCite: A Cooperative Digital Research Library", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:5e01703b5b6b78896c15f24b8f24980f", "modified":"no", "crossref":"4th International Workshop on Peer to Peer Systems", "abstract":"CiteSeer is a well-known online resource for the computer science research community, allowing users to search and browse a large archive of research papers. Unfortunately, its current centralized incarnation is costly to run. Although members of the community would presumably be willing to donate hardware and bandwidth at their own sites to assist CiteSeer, the current architecture does not facilitate such distribution of resources. OverCite is a design for a new architecture for a distributed and cooperative research library based on a distributed hash table (DHT). The new architecture harnesses donated resources at many sites to provide document search and retrieval service to researchers worldwide. A preliminary evaluation of an initial OverCite prototype shows that it can service more queries per second than a centralized system, and that it increases total storage capacity by a factor of n/4 in a system of n nodes. OverCite can exploit these additional resources by supporting new features such as document alerts, and by scaling to larger data sets.", "pages":"69--79", "date":"2005-02", "author":["Stribling, Jeremy","Councill, Isaac G.","Li, Jinyang","Kaashoek, M. Frans","Karger, David R.","Morris, Robert","Shenker, Scott"], "doi":"http://dx.doi.org/10.1007/11558989_7", "editor":"M. 
Frans Kaashoek and Ion Stoica", "venue":"IPTPS", "publisher":"Springer", "month":"February", "cat":["Systems","P2P"], "series":"LNCS Hot Topics", "key":"Karger:OverCite", "year":"2005", "pdf":"http://pdos.csail.mit.edu/papers/overcite:iptps05/paper.pdf", "pub-type":"inproceedings", "confurl":"http://iptps05.cs.cornell.edu/", "booktitle":"4th International Workshop on Peer to Peer Systems", "location":"Ithaca, NY", "address":"Ithaca, NY", "origin":"http://service.simile-widgets.org/babel/preview#OverCite%3A%20A%20Cooperative%20Digital%20Research%20Library" }, {"id":"RDF Authoring Environments for End Users", "label":"RDF Authoring Environments for End Users", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:45c89828cd3080d5da09543a18818b8c", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "abstract":"The Semantic Web promises to open innumerable opportunities for automation and information retrieval by standardizing the protocols for metadata exchange. However, just as the success of the World Wide Web can be attributed to the ease of use and ubiquity of Web browsers, we believe that the unfolding of the Semantic Web vision depends on users getting powerful but easy-to-use tools for managing their information. But unlike HTML, which can be easily edited in any text editor, RDF is more complicated to author and does not have an obvious presentation mechanism. Previous work has concentrated on the ideas of generic RDF graph visualization and RDF Schema-based form generation. In this paper, we present a comprehensive platform for constructing end user applications that create, manipulate, and visualize arbitrary RDF-encoded information, adding another layer to the abstraction cake. We discuss a programming environment specifically designed for manipulating RDF and introduce user interface concepts on top that allow the developer to quickly assemble applications that are based on RDF data models. 
Also, because user interface specifications and program logic are themselves describable in RDF, applications built upon our framework enjoy properties such as network updatability, extensibility, and end user customizability---all desirable characteristics in the spirit of the Semantic Web.", "date":"2003-03", "author":["Quan, Dennis","Huynh, David","Karger, David R."], "pdfkb":"494", "venue":"SWFAT", "month":"March", "cat":["Information Retrieval","Semantic Web","CHI","Haystack"], "key":"Karger:Haystack-swfat03", "year":"2003", "pdf":"http://haystack.csail.mit.edu/papers/swfat2003.pdf", "pub-type":"inproceedings", "confurl":"http://www-kasm.nii.ac.jp/SWFAT/proceedings.html", "booktitle":"International Workshop on Semantic Web Foundations and Application Technologies (SWFAT)", "origin":"http://service.simile-widgets.org/babel/preview#RDF%20Authoring%20Environments%20for%20End%20Users" }, {"id":"Random Sampling from Residual Graphs", "label":"Random Sampling from Residual Graphs", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:f71e2273cda2a3f96834b2c018aad741", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"Proceedings of the {$33^{rd}$} {ACM} Symposium on Theory of Computing", "place":"Montreal, Canada", "abstract":" Consider an n-vertex, m-edge, undirected graph with maximum flow value v. We give a new Õ(m+nv)-time maximum flow algorithm based on finding augmenting paths in random samples of the edges of residual graphs. After assigning certain special sampling probabilities to edges in Õ(m) time, our algorithm is very simple: repeatedly find an augmenting path in a random sample of edges from the residual graph. 
", "pages":"63--66", "date":"2002-05", "ps":"Papers/resflow.ps", "author":["Karger, David R.","Levine, Matthew S."], "venue":"STOC", "publisher":"ACM Press", "month":"May", "cat":["Theory","Cuts and Flows"], "key":"Karger:AugmentingPath-Conf", "year":"2002", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$33^{rd}$} {ACM} Symposium on Theory of Computing", "psgz":"Papers/resflow.ps.gz", "organization":"ACM", "origin":"http://service.simile-widgets.org/babel/preview#Random%20Sampling%20from%20Residual%20Graphs" }, {"id":"How to Make a Semantic Web Browser", "label":"How to Make a Semantic Web Browser", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:abb19b7e063c2c6d3de9c00ef3ff5cdf", "modified":"no", "crossref":"Proceedings of the $13^{th}$ International World Wide Web Conference", "abstract":"Two important architectural choices underlie the success of the Web: numerous, independently operated servers speak a common protocol, and a single type of client-the Web browser-provides point-and-click access to the content and services on these decentralized servers. However, because HTML marries content and presentation into a single representation, end users are often stuck with inappropriate choices made by the Web site designer of how to work with and view the content. RDF metadata on the Semantic Web does not have this limitation: users can gain direct access to the underlying information and control how it is presented for themselves. This principle forms the basis for our Semantic Web browser-an end user application that automatically locates metadata and assembles point-and-click interfaces from a combination of relevant information, ontological specifications, and presentation knowledge, all described in RDF and retrieved dynamically from the Semantic Web. With such a tool, naive users can begin to discover, explore, and utilize Semantic Web data and services. 
Because data and services are accessed directly through a standalone client and not through a central point of access (e.g., a portal), new content and services can be consumed as soon as they become available. In this way we take advantage of an important sociological force that encourages the production of new Semantic Web content by remaining faithful to the decentralized nature of the Web. ", "pages":"255--265", "date":"2004-05", "author":["Quan, Dennis","Karger, David R."], "venue":"WWW", "month":"May", "cat":["Information Retrieval","Haystack","Semantic Web"], "key":"Karger:SemBrowser", "year":"2004", "pdf":"http://www2004.org/proceedings/docs/1p255.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the $13^{th}$ International World Wide Web Conference", "origin":"http://service.simile-widgets.org/babel/preview#How%20to%20Make%20a%20Semantic%20Web%20Browser" }, {"id":"Subjective Cost Policy Routing", "label":"Subjective Cost Policy Routing", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:f22403f7cbff19269c81c40dbe9ced0d", "modified":"no", "abstract":"We study a model of interdomain routing in which autonomous systems' (ASes') routing policies are based on {\\em subjective} cost assessments of alternative routes. The routes are constrained by the requirement that all routes to a given destination must be confluent. We show that it is NP-hard to determine whether there is a set of stable routes. We also show that it is NP-hard to find a set of confluent routes that minimizes the total subjective cost; it is hard even to approximate minimum cost closely. These hardness results hold even for very restricted classes of subjective costs. We then consider a model in which the subjective costs are based on the relative importance ASes place on a small number of objective cost measures. 
We show that a small number of confluent routing trees is sufficient for each AS to have a route that nearly minimizes its subjective cost; these routing trees can be computed easily with a distributed algorithm. Furthermore, we prove that this bound is almost tight.", "pages":"174--183", "date":"2007-06", "ps":"http://www.umich.edu/~rsami/papers/wine.ps", "author":["Feigenbaum, Joan","Karger, David R.","Mirrokni, Vahab S.","Sami, Rahul"], "doi":"http://dx.doi.org/10.1007/11600930_18", "volume":"378", "month":"June", "cat":["Theory","Mechanism Design"], "journal":"Theoretical Computer Science", "key":"Karger:SubjectiveCostRouting-Journal", "year":"2007", "pdf":"http://cs-www.cs.yale.edu/homes/jf/FKMS.pdf", "pub-type":"article", "number":"2", "origin":"http://service.simile-widgets.org/babel/preview#Subjective%20Cost%20Policy%20Routing" }, {"id":"Building Routing Trees with Incomplete Global Knowledge", "label":"Building Routing Trees with Incomplete Global Knowledge", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:5d8e44da53898a27fd3ca25bf7a299c7", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"ef119be3a36d2df2b787db0d61c2270a", "place":"Redondo Beach, CA", "pages":"613--623", "date":"2000-11", "ps":"Papers/maybecast.ps", "author":["Karger, David R.","Minkoff, Maria"], "venue":"FOCS", "publisher":"IEEE Computer Society Press", "month":"November", "cat":"Theory", "key":"Karger:Maybecast", "year":"2000", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$33^{rd}$} Annual Symposium on the Foundations of Computer Science", "organization":"IEEE", "origin":"http://service.simile-widgets.org/babel/preview#Building%20Routing%20Trees%20with%20Incomplete%20Global%20Knowledge" }, {"id":"Random Sampling in Graph Optimization Problems", "label":"Random Sampling in Graph Optimization Problems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:17338ed9dfb5b93da01a5a0191952c20", 
"modified":"no", "abstract":"The representative random sample is a central concept of statistics. It is often possible to gather a great deal of information about a large population by examining a small sample randomly drawn from it. This approach has obvious advantages in reducing the investigator's work, both in gathering and in analyzing the data. We apply the concept of a representative sample to combinatorial optimization. Our general technique is to generate small random representative subproblems and solve them in lieu of the original ones, producing approximately correct answers which may then be refined to correct ones at little additional cost. Our focus is optimization problems on undirected graphs. Highlights of our results include \\begin{itemize} \\item The first (randomized) linear time minimum spanning tree algorithm \\item A (randomized) minimum cut algorithm with running time roughly $O(n^2)$ as compared to previous roughly $O(n^3)$ time bounds, as well as the first algorithm for finding all approximately minimal cuts and multiway cuts \\item An efficient parallelization of the minimum cut algorithm, providing the first parallel (RNC) algorithm for minimum cuts \\item The first proof that minimum cuts can be found deterministically in parallel (NC) \\item Reliability theorems tightly bounding the connectivities and bandwidths in networks with random edge failures, and a fully polynomial time approximation scheme for estimating all-terminal reliability---the probability a particular graph remains connected under edge failures \\item A linear time algorithm for approximating minimum cuts to within $1+\\epsilon$ and a linear processor parallel algorithm for $1+\\epsilon$ approximation, and fast algorithms for approximating $s$-$t$ minimum cuts and maximum flows \\end{itemize}", "date":"1994", "ps":"Papers/thesis.ps", "author":"Karger, David R.", "department":"Computer Science", "cat":["Theory","Cuts and Flows"], "key":"Karger:Thesis", "year":"1994", 
"brag":"Winner, ACM Doctoral Dissertation Award, 1995. To be published by Springer Verlag", "pdf":"Papers/thesis.pdf", "pub-type":"phdthesis", "school":"Stanford University", "address":"Stanford, CA 94305", "origin":"http://service.simile-widgets.org/babel/preview#Random%20Sampling%20in%20Graph%20Optimization%20Problems" }, {"id":"A New Approach to the Minimum Cut Problem", "label":"A New Approach to the Minimum Cut Problem", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:493e3342de500ade5b2593de56967ec8", "modified":"no", "note":"Preliminary portions appeared in SODA 1992 and STOC 1993", "pages":"601--640", "date":"1996-07", "ps":"http://people.csail.mit.edu/karger/Papers/contract.ps", "pdf":"http://people.csail.mit.edu/karger/Papers/contract.pdf", "author":["Karger, David R.","Stein, Clifford"], "doi":"http://doi.acm.org/10.1145/234533.234534", "volume":"43", "month":"July", "cat":["Theory","Cuts and Flows"], "journal":"Journal of the ACM", "key":"Karger:Contraction", "year":"1996", "pub-type":"article", "number":"4", "origin":"http://service.simile-widgets.org/babel/preview#A%20New%20Approach%20to%20the%20Minimum%20Cut%20Problem" }, {"id":"Haystack: A User Interface for Creating, Browsing and Organizing Arbitrary Semistructured Information", "label":"Haystack: A User Interface for Creating, Browsing and Organizing Arbitrary Semistructured Information", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:2e83777c5670b6635b427f34a962652c", "modified":"no", "note":"Demo", "abstract":" Much past HCI research has examined the usability concerns of information management software for specific domains such as object-oriented software design, e-mail, and the Web. We believe that many of the results uncovered by these studies are applicable across multiple domains but that more broadly-scoped experiments require a system that can integrate multiple data sources. 
Haystack is a general-purpose information management environment designed to attack this very problem. Haystack's user interface, which incorporates capabilities from previous research such as context-specific visualization paradigms and attribute-based categorization, is built upon a highly expressive semistructured data model and data integration capabilities. In our demonstration we show how combination of a direct-manipulation-based UI paradigm and an expressive, federated data model can begin to address many of the information management problems plaguing general desktop computing today and can serve as a basis for further, yet unexplored, crossover information interaction experiments. ", "date":"2004-04", "author":["Quan, Dennis","Karger, David R."], "doi":"http://doi.acm.org/10.1145/985921.985931", "venue":"CHI", "month":"April", "cat":["CHI","Information Retrieval","Haystack","Semantic Web"], "key":"Karger:HaystackDemo", "year":"2004", "pub-type":"inproceedings", "booktitle":"Proceedings of the ACM CHI Conference on Human Factors in Computing Systems", "origin":"http://service.simile-widgets.org/babel/preview#Haystack%3A%20A%20User%20Interface%20for%20Creating%2C%20Browsing%20and%20Organizing%20Arbitrary%20Semistructured%20Information" }, {"id":"On the Feasibility of Peer-to-Peer Web Indexing and Search", "label":"On the Feasibility of Peer-to-Peer Web Indexing and Search", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:012df9991d2c35a0aa66247535843068", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"2nd International Workshop on Peer to Peer Systems", "abstract":"This paper discusses the feasibility of peer-to-peer full-text keyword search of the Web. Two classes of keyword search techniques are in use or have been proposed: flooding of queries over an overlay network (as in Gnutella), and intersection of index lists stored in a distributed hash table. 
We present a simple feasibility analysis based on the resource constraints and search workload. Our study suggests that the peer-to-peer network does not have enough capacity to make naive use of either of search techniques attractive for Web search. The paper presents a number of existing and novel optimizations for P2P search based on distributed hash tables, estimates their effects on performance, and concludes that in combination these optimizations would bring the problem to within an order of magnitude of feasibility. The paper suggests a number of compromises that might achieve the last order of magnitude.", "pages":"207--215", "date":"2003-01", "author":["Li, Jinyang","Loo, Book Thau","Hellerstein, Joe","Kaashoek, M. Frans","Karger, David R.","Morris, Robert"], "editor":"M. Frans Kaashoek and Ion Stoica", "venue":"IPTPS", "publisher":"Springer", "month":"January", "cat":["Systems","P2P"], "series":"LNCS Hot Topics", "key":"Karger:P2P-search", "year":"2003", "pdf":"http://pdos.csail.mit.edu/~rtm/papers/search_feasibility.pdf", "pub-type":"inproceedings", "confurl":"http://iptps03.cs.berkeley.edu/", "booktitle":"2nd International Workshop on Peer to Peer Systems", "address":"Berkeley, CA", "origin":"http://service.simile-widgets.org/babel/preview#On%20the%20Feasibility%20of%20Peer-to-Peer%20Web%20Indexing%20and%20Search" }, {"id":"Sticky Notes for the Semantic Web", "label":"Sticky Notes for the Semantic Web", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:aa023d2cb2c81145c964c31d74c9aca9", "modified":"no", "note":"Poster.", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"Proceedings of the 2003 International Conference on Intelligent User Interfaces", "pages":"254--256", "date":"2003-01", "author":["Karger, David R.","Katz, Boris","Lin, Jimmy","Quan, Dennis"], "url":"citeseer.ist.psu.edu/563406.html", "bibsource":"DBLP, http://dblp.uni-trier.de", "venue":"IUI", "publisher":"ACM", "month":"January", "cat":["Information 
Retrieval","Semantic Web","CHI","Haystack"], "key":"Karger:StickyNotes-Poster", "year":"2003", "isbn":"1-58113-586-6", "pdf":"http://haystack.lcs.mit.edu/papers/iui2003-annotation.pdf", "pub-type":["misc","poster"], "confurl":"http://www.iuiconf.org/03program.html", "booktitle":"Intelligent User Interfaces", "address":"Miami, FL", "origin":"http://service.simile-widgets.org/babel/preview#Sticky%20Notes%20for%20the%20Semantic%20Web" }, {"id":"da4d24510c68695d82ec7138d1115078", "label":"Randomized Algorithms", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:da4d24510c68695d82ec7138d1115078", "modified":"no", "date":"1997-08", "author":["Goemans, Michel X.","Karger, David R.","Kleinberg, Jon"], "editor":"Mauro Dell'Amico and Francesco Maffioli and Silvano Martello", "publisher":"John Wiley \\& Sons", "month":"August", "cat":"Theory", "key":"Karger:ABCO", "year":"1997", "isbn":"0-471-96574-X", "pub-type":"incollection", "booktitle":"Annotated Bibliographies in Combinatorial Optimization", "origin":"http://service.simile-widgets.org/babel/preview#da4d24510c68695d82ec7138d1115078" }, {"id":"e905a0ee1bea51d869219fa15fa5ce31", "label":"Random Sampling in Cut, Flow, and Network Design Problems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:e905a0ee1bea51d869219fa15fa5ce31", "modified":"no", "note":"A preliminary version appeared in Proceedings of the {$26^{th}$} {ACM} Symposium on Theory of Computing", "abstract":"We use random sampling as a tool for solving undirected graph problems. We show that the sparse graph, or skeleton, that arises when we randomly sample a graph's edges will accurately approximate the value of all cuts in the original graph with high probability. This makes sampling effective for problems involving cuts in graphs. We present fast randomized (Monte Carlo and Las Vegas) algorithms for approximating and exactly finding minimum cuts and maximum flows in unweighted, undirected graphs. 
Our cut-approximation algorithms extend unchanged to weighted graphs while our weighted-graph flow algorithms are somewhat slower. Our approach gives a general paradigm with potential applications to any packing problem. It has since been used in a near-linear time algorithm for finding minimum cuts, as well as faster cut and flow algorithms. Our sampling theorems also yield faster algorithms for several other cut-based problems, including approximating the best balanced cut of a graph, finding a k-connected orientation of a 2k-connected graph, and finding integral multicommodity flows in graphs with a great deal of excess capacity. Our methods also improve the efficiency of some parallel cut and flow algorithms. Our methods also apply to the network design problem, where we wish to build a network satisfying certain connectivity requirements between vertices. We can purchase edges of various costs and wish to satisfy the requirements at minimum total cost. Since our sampling theorems apply even when the sampling probabilities are different for different edges, we can apply randomized rounding to solve network design problems. This gives approximation algorithms that guarantee much better approximations than previous algorithms whenever the minimum connectivity requirement is large. 
As a particular example, we improve the best approximation bound for the minimum k-connected subgraph problem from 1.85 to [math not displayed].", "pages":"383--413", "date":"1999-05", "ps":"http://people.csail.mit.edu/karger/Papers/skeleton-journal.ps", "author":"Karger, David R.", "volume":"24", "month":"May", "cat":["Theory","Cuts and Flows"], "journal":"Mathematics of Operations Research", "key":"Karger:Skeleton", "year":"1999", "pub-type":"article", "number":"2", "origin":"http://service.simile-widgets.org/babel/preview#e905a0ee1bea51d869219fa15fa5ce31" }, {"id":"2321e2e18682933b8b1f9fa65c1be1aa", "label":"Augmenting Undirected Edge Connectivity in {$\\Olog(n^2)$} Time", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:2321e2e18682933b8b1f9fa65c1be1aa", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"Proceedings of the {$9^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "place":"San Francisco, CA", "abstract":"We give improved randomized algorithms for the undirected edge splitting and connectivity augmentation problems. Our algorithms are an approximately $O()$ factor faster than the best known deterministic ones. 
Our runtimes of $O()$ are near-optimal in the sense that even for sparse input graphs the optimum output graph may require $\\Omega()$ edges.", "pages":"500--509", "date":"1998-01", "author":["Bencz{\\'u}r, Andr{\\'a}s A.","Karger, David R."], "editor":"Howard Karloff", "venue":"SODA", "month":"January", "cat":["Theory","Cuts and Flows"], "key":"Karger:Augmentation-Conf", "year":"1998", "brag":"Journal version appears in Journal of Algorithms 37", "pdf":"http://people.csail.mit.edu/karger/Papers/augment.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$9^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "organization":"ACM-SIAM", "origin":"http://service.simile-widgets.org/babel/preview#2321e2e18682933b8b1f9fa65c1be1aa" }, {"id":"A Better Algorithm for an Ancient Scheduling Problem", "label":"A Better Algorithm for an Ancient Scheduling Problem", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:17dc946f84d3137afb2493c6566288f4", "modified":"no", "note":"A preliminary version appeared in Proceedings of the {$5^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "abstract":"One of the oldest and simplest variants of multiprocessor scheduling is the on-line scheduling problem studied by Graham in 1966. In this problem, the jobs arrive on-line and must be scheduled non-preemptively on m identical machines so as to minimize the makespan. The size of a job is known on arrival. Graham proved that the List Processing Algorithm which assigns each job to the currently least loaded machine has competitive ratio (2 - 1/m). Recently algorithms with smaller competitive ratios than List Processing have been discovered, culminating in Bartal, Fiat, Karloff, and Vohra's construction of an algorithm with competitive ratio bounded away from 2. Their algorithm has a competitive ratio of at most (2 - 1/70) ≈ 1.986 for all m; hence for m > 70, their algorithm is provably better than List Processing. 
We present a more natural algorithm that outperforms List Processing for any m ≥ 6 and has a competitive ratio of at most 1.945 for all m, which is significantly closer to the best known lower bound of 1.837 for the problem. We show that our analysis of the algorithm is almost tight by presenting a lower bound of 1.9378 on the algorithm's competitive ratio for large m. ", "pages":"400--430", "date":"1996-03", "ps":"http://www.cse.msu.edu/~torng/Research/Pubs/ancient.ps", "author":["Karger, David R.","Phillips, Steven","Torng, Eric"], "volume":"20", "month":"March", "cat":["Theory","Scheduling"], "journal":"Journal of Algorithms", "key":"Karger:Makespan", "year":"1996", "pub-type":"article", "origin":"http://service.simile-widgets.org/babel/preview#A%20Better%20Algorithm%20for%20an%20Ancient%20Scheduling%20Problem" }, {"id":"Diminished Chord: A Protocol for Heterogeneous Subgroup Formation in Peer to Peer Systems", "label":"Diminished Chord: A Protocol for Heterogeneous Subgroup Formation in Peer to Peer Systems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:e6aac9daba8a660e5d2f399b84ab3b73", "modified":"no", "crossref":"Proceedings of the {$15^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "abstract":"In most of the P2P systems developed so far, all nodes play essentially the same role. In some applications, however, different machine capabilities or owner preferences may mean that only a subset of nodes in the system should participate in offering a particular service. Arranging for each service to be supported by a different peer to peer network is, we argue here, a wasteful solution. Instead, we propose a version of the Chord peer-to-peer protocol that allows any subset of nodes in the network to jointly offer a service without forming their own Chord ring. 
Our variant supports the same efficient join/leave/insert/delete operations that the subgroup would get if they did form their own separate peer to peer network, but requires significantly less resources than the separate network would. For each subgroup of k machines, our protocol uses O(k) additional storage in the primal Chord ring. The insertion or deletion of a node in the subgroup and the lookup of the next node of a subgroup all require O(log n) hops. ", "date":"2004-01", "ps":"Papers/subgroups-iptps.ps", "author":["Karger, David R.","Ruhl, Matthias"], "venue":"SODA", "publisher":"Society for Industrial and Applied Mathematics", "month":"January", "cat":["Theory","P2P"], "key":"Karger:SubRings", "year":"2004", "isbn":"0-89871-558-X", "pdf":"Papers/subgroups-iptps.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$15^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "location":"New Orleans, Louisiana", "address":"Philadelphia, PA, USA", "origin":"http://service.simile-widgets.org/babel/preview#Diminished%20Chord%3A%20A%20Protocol%20for%20Heterogeneous%20Subgroup%20Formation%20in%20Peer%20to%20Peer%20Systems" }, {"id":"Random Sampling in Graph Optimization Problems: A Survey", "label":"Random Sampling in Graph Optimization Problems: A Survey", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:76e09a7471a9d1fe3ec3e1dea4234103", "modified":"no", "abstract":"Randomization has become a pervasive technique in combinatorial optimization. We survey our thesis and subsequent work, which uses four common randomization techniques to attack numerous optimization problems on undirected graphs. 
", "pages":"1--11", "date":"1998", "author":"Karger, David R.", "volume":"58", "cat":["Theory","Cuts and Flows"], "journal":"Optima", "key":"Karger:Optima", "year":"1998", "pdf":"http://www.ise.ufl.edu/~optima/optima58.pdf", "pub-type":"article", "origin":"http://service.simile-widgets.org/babel/preview#Random%20Sampling%20in%20Graph%20Optimization%20Problems%3A%20A%20Survey" }, {"id":"Relo: Helping Users Manage Context during Interactive Exploratory Visualization of Large Codebases", "label":"Relo: Helping Users Manage Context during Interactive Exploratory Visualization of Large Codebases", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:c3cfb5c24738d6ea8ab126d68b5371db", "modified":"no", "abstract":"As software systems grow in size and use more third-party libraries and frameworks, the need for developers to understand unfamiliar large codebases is rapidly increasing. In this paper, we present a tool, Relo, that supports developers' understanding by allowing interactive exploration of code. As the developer explores relationships found in the code, Relo builds and automatically manages the context in a visualization, thereby helping build the developer's mental representation of the code. Developers can group viewed artifacts or use the viewed items to ask Relo for further exploration suggestions, with Relo providing features to limit the growth of the diagram. 
To ensure developers don't get overwhelmed, Relo has been built with a user-centered approach, and preliminary evaluations with developers exploring new code have shown them to find the tool intuitive and helpful.", "pages":"187-194", "date":"2006-09", "author":["Sinha, Vineet","Karger, David R.","Miller, Robert C."], "venue":"VLHCC", "month":"September", "cat":["Information Retrieval","CHI"], "key":"Karger:Relo", "year":"2006", "pdf":"http://relo.csail.mit.edu/documentation/relo-vlhcc06.pdf", "pub-type":"inproceedings", "booktitle":"VL/HCC: Visual Languages and Human Centered Computing", "location":"Brighton, UK", "origin":"http://service.simile-widgets.org/babel/preview#Relo%3A%20Helping%20Users%20Manage%20Context%20during%20Interactive%20Exploratory%20Visualization%20of%20Large%20Codebases" }, {"id":"3b6a25c9597a15a1a2097d104a538bc8", "label":"Simple efficient load balancing algorithms for peer-to-peer systems", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:3b6a25c9597a15a1a2097d104a538bc8", "modified":"no", "note":"Preliminary versions in IPTPS 2004 and SPAA 2004", "abstract":"Load balancing is a critical issue for the efficient operation of peer-to-peer (P2P) networks. We give two new load-balancing protocols whose provable performance guarantees are within a constant factor of optimal. Our protocols refine the consistent hashing data structure that underlies the Chord (and Koorde) P2P network. Both preserve Chord's logarithmic query time and near-optimal data migration cost. Consistent hashing is an instance of the distributed hash table (DHT) paradigm for assigning items to nodes in a P2P system: items and nodes are mapped to a common address space, and nodes have to store all items residing closeby in the address space. Our first protocol balances the distribution of the key address space to nodes, which yields a load-balanced system when the DHT maps items \"randomly\" into the address space. 
To our knowledge, this yields the first P2P scheme simultaneously achieving O(log n) degree, O(log n) look-up cost, and constant-factor load balance (previous schemes settled for any two of the three). Our second protocol aims to balance directly the distribution of items among the nodes. This is useful when the distribution of items in the address space cannot be randomized. We give a simple protocol that balances load by moving nodes to arbitrary locations \"where they are needed.\" As an application, we use the last protocol to give an optimal implementation of a distributed data structure for range searches on ordered data. ", "pages":"787--804", "date":"2006-11", "author":["Karger, David R.","Ruhl, Matthias"], "volume":"39", "month":"November", "cat":["Theory","P2P"], "journal":"Theory of Computing Systems", "key":"Karger:P2PLoadBalance-Journal", "year":"2006", "pdf":"Papers/dht-loadbalance-journal.pdf", "pub-type":"article", "number":"6", "origin":"http://service.simile-widgets.org/babel/preview#3b6a25c9597a15a1a2097d104a538bc8" }, {"id":"Enumerating Parametric Global Minimum Cuts by Random Interleaving", "label":"Enumerating Parametric Global Minimum Cuts by Random Interleaving", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:e4143e790570bf82662b882c24284f21", "modified":"no", "pages":"542--555", "date":"2016-06", "author":"Karger, David R.", "doi":"http://doi.acm.org/10.1145/2897518.2897578", "acmid":"2897578", "publisher":"ACM", "doin":"10.1145/2897518.2897578", "month":"June", "cat":["Theory","Cuts and Flows"], "series":"STOC 2016", "key":"Karger:ParametricMincut", "numpages":"14", "year":"2016", "isbn":"978-1-4503-4132-5", "pdf":"Papers/parametric-mincut.pdf", "pub-type":"inproceedings", "keywords":["graph algorithms","minimum cuts"], "booktitle":"Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing", "location":"Cambridge, MA, USA", "address":"New York, NY, USA", 
"origin":"http://service.simile-widgets.org/babel/preview#Enumerating%20Parametric%20Global%20Minimum%20Cuts%20by%20Random%20Interleaving", "abstract": "Recently, Aissi et al. gave new counting and algorithmic bounds for parametric minimum cuts in a graph, where each edge cost is a linear combination of multiple cost criteria and different cuts become minimum as the coefficients of the linear combination are varied. In this article, we derive better bounds using a mathematically simpler argument. We provide faster algorithms for enumerating these cuts. We give a lower bound showing our upper bounds have roughly the right form. Our results also immediately generalize to parametric versions of other problems solved by the Contraction Algorithm, including approximate min-cuts, multi-way cuts, and a matroid optimization problem. We also give a first generalization to nonlinear parametric minimum cuts." }, {"id":"A Unified Abstraction for Messaging on the Semantic Web", "label":"A Unified Abstraction for Messaging on the Semantic Web", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:84842557044699dc472e68a98c80f385", "modified":"no", "crossref":"Proceedings of the $12^{th}$ International World Wide Web Conference", "abstract":"Since its inception, the Internet has been a hotbed of several successful communications channels, starting off with e-mail, Internet Relay Chat and Usenet newsgroups and more recently adding Web annotation, instant messaging, and news feeds. However, these channels were developed fairly independently, and in many cases their respective functionalities have grown to overlap significantly. For instance, users of these systems have separate identifiers for e-mail, chat, and instant messaging, and clients for these systems all have their own implementations of threaded message views. We believe these problems stem from a lack of a common user interface and data model. 
In this paper we use basic concepts from the Semantic Web and RDF to unify and model these seemingly disparate messaging paradigms. We also demonstrate a generalized user interface for messaging that uses the data model we have developed. From this process we realize a number of synergies that result from the reduction of overlap and the finer-grained control users are given over message composition, transmission, storage and retrieval.", "pages":"231", "date":"2003-05", "author":["Quan, Dennis","Bakshi, Karun","Karger, David R."], "event":"Developer's Day", "pdfkb":"307", "venue":"WWW", "month":"May", "cat":["Information Retrieval","Semantic Web","CHI","Haystack"], "pdfurl":"http://haystack.lcs.mit.edu/papers/www2003-messaging.pdf", "key":"Haystack:Messaging", "year":"2003", "pub-type":"inproceedings", "confurl":"http://www.www2003.org/", "booktitle":"Developers' day, $12^{th}$ International World Wide Web Conference", "origin":"http://service.simile-widgets.org/babel/preview#A%20Unified%20Abstraction%20for%20Messaging%20on%20the%20Semantic%20Web" }, {"id":"Soylent: A Word Processor with a Crowd Inside", "label":"Soylent: A Word Processor with a Crowd Inside", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:65ea77fbff662c6ed79d5ae08a6bd875", "modified":"no", "pages":"85--94", "date":"2015-07", "author":["Bernstein, Michael S.","Little, Greg","Miller, Robert C.","Hartmann, Bj\\\"{o}rn","Ackerman, Mark S.","Karger, David R.","Crowell, David","Panovich, Katrina"], "url":"http://doi.acm.org/10.1145/2791285", "doi":"10.1145/2791285", "issue_date":"August 2015", "acmid":"2791285", "volume":"58", "publisher":"ACM", "month":"July", "cat":["CHI","Systems","Mechanism Design"], "journal":"Commun. 
ACM", "key":"Bernstein:SoylentCACM", "numpages":"10", "year":"2015", "pub-type":"article", "issn":"0001-0782", "number":"8", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Soylent%3A%20A%20Word%20Processor%20with%20a%20Crowd%20Inside" }, {"id":"Finding Maximum Flows in Simple Undirected Graphs Seems Faster than Bipartite Matching", "label":"Finding Maximum Flows in Simple Undirected Graphs Seems Faster than Bipartite Matching", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:d2c66713e0fabb50b045613c85d1c8c8", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"Proceedings of the {$29^{th}$} {ACM} Symposium on Theory of Computing", "place":"Dallas, TX", "abstract":"Consider an $n$-vertex, $m$-edge, undirected graph with maximum flow value $v$. We give a method to find augmenting paths in such a graph in amortized sub-linear ($O(n\\sqrt{v})$) time per path. This lets us improve the time bound of the classic augmenting path algorithm to $O(m + nv^{3/2})$ on simple graphs. 
The addition of a blocking flow subroutine gives a simple, deterministic $O(nm^{2/3}v^{1/6})$-time algorithm. We also use our technique to improve known randomized algorithms, giving $\\tilde{O}(m+nv^{5/4})$-time and $\\tilde{O}(m+n^{11/9}v)$-time algorithms for capacitated undirected graphs. For simple graphs, in which $v \\le n$, the last bound is $\\tilde{O}(n^{2.2})$, improving on the best previous bound of $O(n^{2.5})$, which is also the best known time bound for bipartite matching.", "pages":"69--78", "date":"1998-05", "ps":"http://people.csail.mit.edu/karger/Papers/stoc98.ps", "author":["Karger, David R.","Levine, Matthew"], "venue":"STOC", "publisher":"ACM Press", "month":"May~23--26", "cat":["Theory","Cuts and Flows"], "key":"Karger:SimpleFlow", "year":"1998", "isbn":"0-89791-962-9", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$29^{th}$} {ACM} Symposium on Theory of Computing", "organization":"ACM", "address":"New York", "origin":"http://service.simile-widgets.org/babel/preview#Finding%20Maximum%20Flows%20in%20Simple%20Undirected%20Graphs%20Seems%20Faster%20than%20Bipartite%20Matching" }, {"id":"Using Randomized Sparsification to Approximate Minimum Cuts", "label":"Using Randomized Sparsification to Approximate Minimum Cuts", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:a5dcabcdb1efd610b570f5279e74d2e1", "modified":"no", "crossref":"Proceedings of the {$5^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "place":"Arlington, VA", "abstract":"We develop new parallel and dynamic algorithms for approximating minimum cuts in weighted, undirected graphs. Our approach is to combine new algorithms for sparse unweighted graphs with a reduction for dense weighted graphs called {\\em randomized sparsification}. Randomized sparsification yields a sparse unweighted graph that closely approximates the minimum cut structure of the original graph. In~\\cite{Karger:Skeleton}, this technique was used in sequential algorithms for approximating the minimum cut. 
In this paper we devise parallel and dynamic approximation algorithms. We show that a cut within a multiplicative factor of $\\alpha$ of the minimum can be found in $\\RNC$ using $m+n^{2/\\alpha}$ processors. Using similar techniques, we give a {\\em dynamic approximation algorithm} for a graph undergoing a series of edge insertions and deletions. At a cost of $\\Olog(n/\\epsilon^2)$ time per insertion or deletion, the algorithm will maintain a cut with value at most $(1+\\epsilon)$ times the minimum. We also consider a functional inverse of randomized sparsification, and use it to develop a different dynamic algorithm that approximates the value of the minimum cut more quickly than the previous algorithm, but does not actually exhibit a cut of small value. An $O(\\sqrt{1+2/\\epsilon})$-approximation to the minimum cut value is maintained at a cost of $\\Olog(n^{\\epsilon+1/2})$ time per insertion or deletion. If only insertions are allowed, the approximation can be maintained at a cost of $\\Olog(n^{\\epsilon})$ time per insertion.", "pages":"424--432", "date":"1994-01", "ps":"Papers/approxcut.ps", "author":"Karger, David R.", "editor":"Daniel D. Sleator", "venue":"SODA", "month":"January", "cat":["Theory","Cuts and Flows"], "key":"Karger:Approxcut", "year":"1994", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$5^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "psgz":"Papers/approxcut.ps.gz", "organization":"ACM-SIAM", "origin":"http://service.simile-widgets.org/babel/preview#Using%20Randomized%20Sparsification%20to%20Approximate%20Minimum%20Cuts" }, {"id":"Distributed Quota Enforcement for Spam Control", "label":"Distributed Quota Enforcement for Spam Control", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:745ebde354a358bb9e72d919d85e35bb", "modified":"no", "abstract":"Spam, by overwhelming inboxes, has made email a less reliable medium than it was just a few years ago. 
Spam filters are undeniably useful but unfortunately can flag non-spam as spam. To restore email's reliability, a recent spam control approach grants quotas of stamps to senders and has the receiver communicate with a wellknown quota enforcer to verify that the stamp on the email is fresh and to cancel the stamp to prevent reuse. The literature has several proposals based on this general idea but no complete system design and implementation that: scales to today's email load (which requires the enforcer to be distributed over many hosts and to tolerate faults in them), imposes minimal trust assumptions, resists attack, and upholds today's email privacy. This paper describes the design, implementation, analysis, and experimental evaluation of DQE, a spam control system that meets these challenges. DQE's enforcer occupies a point in the design spectrum notable for simplicity: mutually untrusting nodes implement a storage abstraction but avoid neighbor maintenance, replica maintenance, and heavyweight cryptography.", "date":"2006-05", "author":["Walfish, Michael","Zamfirescu, J. D.","Balakrishnan, Hari","Karger, David R.","Shenker, Scott"], "venue":"NSDI", "month":"May", "cat":["Systems","P2P"], "key":"Karger:DQE", "year":"2006", "pdf":"http://nms.csail.mit.edu/papers/dqe-nsdi06.pdf", "pub-type":"inproceedings", "booktitle":"NSDI: Networking Systems Design and Implementation", "origin":"http://service.simile-widgets.org/babel/preview#Distributed%20Quota%20Enforcement%20for%20Spam%20Control" }, {"id":"Online Reading Informs Classroom Instruction and Promotes Collaborative Learning", "label":"Online Reading Informs Classroom Instruction and Promotes Collaborative Learning", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:915c07ffd42b30b2b6a5a79a051e7079", "modified":"no", "date":"2013-12", "author":["Wright, L. 
Kate","Newman, Dina L.","Zyto, Sacha","Karger, David R."], "url":"http://digital.nsta.org/publication/?i=178478&p=46", "volume":"43", "publisher":"National Science Teacher's Association", "date_0":"2013-12", "month":"December", "cat":["CHI","Education"], "journal":"Journal of College Science Teaching", "key":"Karger:BioNB", "year":"2013", "pub-type":"article", "number":"2", "origin":"http://service.simile-widgets.org/babel/preview#Online%20Reading%20Informs%20Classroom%20Instruction%20and%20Promotes%20Collaborative%20Learning" }, {"id":"4b5d30682fa2a5d5d8374a9718b4edd9", "label":"Random Sampling in Graph Optimization Problems: A Survey", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:4b5d30682fa2a5d5d8374a9718b4edd9", "modified":"no", "abstract":"Randomization has become a pervasive technique in combinatorial optimization. We survey our thesis and subsequent work, which uses four common randomization techniques to attack numerous optimization problems on undirected graphs. 1 Introduction Randomization has become a pervasive technique in combinatorial optimization. Randomization has been used to develop algorithms that are faster, simpler, and/or better-performing than previous deterministic algorithms.", "date":"2001", "basefilename":"random", "ps":"Papers/random.ps", "author":"Karger, David R.", "editor":"S. Rajasekaran and P. Pardalos and J.H. Reif and J. 
Rolim", "publisher":"Kluwer Academic Press", "cat":"Theory", "key":"Karger:RandomizationHandbook", "year":"2001", "pub-type":"incollection", "booktitle":"Handbook on Randomization", "psgz":"Papers/random.ps.gz", "origin":"http://service.simile-widgets.org/babel/preview#4b5d30682fa2a5d5d8374a9718b4edd9" }, {"id":"On Randomized Network Coding", "label":"On Randomized Network Coding", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:cc2f44154e5faaef38a309314212f8f7", "modified":"no", "note":"Invited paper", "abstract":"We consider a randomized network coding approach for multicasting from several sources over a network, in which nodes independently and randomly select linear mappings from inputs onto output links over some field. This approach was first described in [3], which gave, for acyclic delay-free networks, a bound on error probability, in terms of the number of receivers and random coding output links, that decreases exponentially with code length. The proof was based on a result in [2] relating algebraic network coding to network flows. In this paper, we generalize these results to networks with cycles and delay. We also show, for any given acyclic network, a tighter bound in terms of the probability of connection feasibility in a related network problem with unreliable links. 
From this we obtain a success probability bound for randomized network coding in link-redundant networks with unreliable links, in terms of link failure probability and amount of redundancy.", "date":"2003-10", "author":["Ho, Tracey","M\\'{e}dard, Muriel","Shi, J.","Effros, Michelle","Karger, David R."], "venue":"Allerton", "month":"October", "cat":["Theory","Applications of Theory","Coding","Cuts and Flows"], "key":"Karger:NCoding1", "year":"2003", "pdf":"http://www.its.caltech.edu/~tho/allerton.pdf", "pub-type":"inproceedings", "booktitle":"$41^{st}$ Allerton Annual Conference on Communication, Control, and Signal Processing", "origin":"http://service.simile-widgets.org/babel/preview#On%20Randomized%20Network%20Coding" }, {"id":"{(De)randomized} Construction of Small Sample Spaces in~{$\\mathcal{NC}$}", "label":"{(De)randomized} Construction of Small Sample Spaces in~{$\\mathcal{NC}$}", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:26610b2769fedf192c890ce8c4f9d21d", "modified":"no", "abstract":"Koller and Megiddo introduced the paradigm of constructing compact distributions that satisfy a given set of constraints and showed how it can be used to efficiently derandomize certain types of algorithms. In this paper, we significantly extend their results in two ways. First, we show how their approach can be applied to deal with more general expectation constraints. More importantly, we provide the first parallel (NC) algorithm for constructing a compact distribution that satisfies the constraints up to a small relative error. This algorithm deals with constraints over any event that can be verified by finite automata, including all independence constraints as well as constraints over events relating to the parity or sum of a certain set of variables. Our construction relies on a new and independently interesting parallel algorithm for converting a solution to a linear system into an almost basic approximate solution to the same system. 
We use these techniques in the first NC derandomization of an algorithm for constructing large independent sets in d-uniform hypergraphs for arbitrary d. We also show how the linear programming perspective suggests new proof techniques which might be useful in general probabilistic analysis.", "pages":"402--413", "date":"1997-12", "ps":"http://people.csail.mit.edu/karger/Papers/random.ps", "author":["Karger, David R.","Koller, Daphne"], "preliminary":"focs::KargerK1994", "volume":"55", "month":"December", "cat":"Theory", "journal":"Journal of Computer and System Sciences", "key":"Karger:Random", "year":"1997", "brag":"Special issue of selected papers from Proceedings of the {$35^{th}$} Annual Symposium on the Foundations of Computer Science", "pub-type":"article", "number":"3", "origin":"http://service.simile-widgets.org/babel/preview#%7B(De)randomized%7D%20Construction%20of%20Small%20Sample%20Spaces%20in~%7B%24%5Cmathcal%7BNC%7D%24%7D" }, {"id":"The web page as a WYSIWYG end-user customizable database-backed information management application", "label":"The web page as a WYSIWYG end-user customizable database-backed information management application", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:6abf0fb136b5d58710880aed2074196c", "modified":"no", "abstract":"Dido is an application (and application development environment) in a web page. It is a single web page containing rich structured data, an AJAXy interactive visualizer/editor for that data, and a ``metaeditor'' for WYSIWYG editing of the visualizer/editor. Historically, users have been limited to the data schemas, visualizations, and interactions offered by a small number of heavyweight applications. In contrast, Dido encourages and enables the end user to edit (not code) in his or her web browser a distinct ephemeral interaction ``wrapper'' for each data collection that is specifically suited to its intended use. 
Dido's \\emph{active document} metaphor has been explored before but we show how, given today's web infrastructure, it can be deployed in a small self-contained HTML document without touching a web client or server.", "pages":"257--260", "date":"2009-10", "author":["Karger, David R.","Ostler, Scott","Lee, Ryan"], "url":"http://projects.csail.mit.edu/exhibit/Dido/", "doi":"http://doi.acm.org/10.1145/1622176.1622223", "venue":"UIST", "publisher":"ACM", "month":"October", "cat":["CHI","Haystack","Semantic Web","Systems","Visualization"], "hideaddress":"New York, NY, USA", "key":"Karger:DIDO", "year":"2009", "isbn":"978-1-60558-745-5", "pdf":"Papers/dido.pdf", "pub-type":"inproceedings", "booktitle":"UIST '09: Proceedings of the 22nd annual ACM symposium on User interface software and technology", "location":"Victoria, BC, Canada", "origin":"http://service.simile-widgets.org/babel/preview#The%20web%20page%20as%20a%20WYSIWYG%20end-user%20customizable%20database-backed%20information%20management%20application" }, {"id":"Adding Multiple Cost Constraints to Combinatorial Optimization Problems, with Applications to Multicommodity Flows", "label":"Adding Multiple Cost Constraints to Combinatorial Optimization Problems, with Applications to Multicommodity Flows", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:7a6bf65c4a9f6fe8c9a13c5866e18d75", "modified":"no", "crossref":"Proceedings of the {$27^{th}$} {ACM} Symposium on Theory of Computing", "place":"Las Vegas, NV", "abstract":"Minimum cost multicommodity flow is an instance of a simpler problem (multicommodity flow) to which a cost constraint has been added. In this paper we present a general scheme for solving a large class of such ``cost-added'' problems---even if more than one cost is added. One of the main applications of this method is a new deterministic algorithm for approximately solving the minimumcost multicommodity flow problem. 
Our algorithm finds a $(1+\\epsilon)$ approximation to the minimum cost flow in $\\tilde{O}(\\epsilon^{-3}kmn)$ time, where k is the number of commodities, m is the number of edges, and n is the number of vertices in the input problem. This improves the previous best deterministic bounds of $O(\\epsilon^{-4}kmn^2)$ [9] and $\\tilde{O}(\\epsilon^{-2}k^2m^2)$ [15] by factors of $n/\\epsilon$ and $\\epsilon km/n$ respectively. In fact, it even dominates the best randomized bound of $\\tilde{O}(\\epsilon^{-2}km^2)$ [15]. The algorithm presented in this paper efficiently solves several other interesting generalizations of min-cost flow problems, such as one in which each commodity can have its own distinct shipping cost per edge, or one in which there is more than one cost measure on the flows and all costs must be kept small simultaneously. Our approach is based on an extension of the approximate packing techniques in [15] and a generalization of the round-robin approach of [16] to multicommodity flow without costs.", "pages":"18--25", "date":"1995-05", "ps":"Papers/packing.ps", "author":["Karger, David R.","Plotkin, Serge"], "venue":"STOC", "publisher":"ACM Press", "month":"May", "cat":"Theory", "key":"Karger:Packing", "year":"1995", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$27^{th}$} {ACM} Symposium on Theory of Computing", "organization":"ACM", "origin":"http://service.simile-widgets.org/babel/preview#Adding%20Multiple%20Cost%20Constraints%20to%20Combinatorial%20Optimization%20Problems%2C%20with%20Applications%20to%20Multicommodity%20Flows" }, {"id":"Potluck: Data Mash-Up Tool for Casual Users", "label":"Potluck: Data Mash-Up Tool for Casual Users", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:15b2968375d33a3c093b27abc637712e", "modified":"no", "note":"to appear", "crossref":"$6^{th}$ International Semantic Web Conference (ISWC)", "abstract":"As more and more reusable structured data appears on the Web, casual users will want to take into their own hands the task of mashing up data rather than wait for mash-up sites to be built 
that address exactly their individually unique needs. In this paper, we present Potluck, a Web user interface that lets casual users (those without programming skills and data modeling expertise) mash up data themselves.
Potluck is novel in its use of drag and drop for merging fields, its integration and extension of the faceted browsing paradigm for focusing on subsets of data to align, and its application of simultaneous editing for cleaning up data syntactically. Potluck also lets the user construct rich visualizations of data in-place as the user aligns and cleans up the data. This iterative process of integrating the data while constructing useful visualizations is desirable when the user is unfamiliar with the data at the beginning (a common case) and wishes to get immediate value out of the data without having to spend the overhead of completely and perfectly integrating the data first.
A user study on Potluck indicated that it was usable and learnable, and elicited excitement from programmers who, even with their programming skills, previously had great difficulties performing data integration.
", "date":"2007-11", "author":["Huynh, David","Miller, Robert","Karger, David R."], "work-done-at":"MIT CSAIL", "screencastkb":"67274", "screencasturl":"http://people.csail.mit.edu/dfhuynh/research/media/iswc2007/potluck-screencast.mov", "pdfkb":"1677", "venue":"ISWC", "month":"November", "cat":["Information Retrieval","Semantic Web","CHI","Haystack"], "key":"Karger:Potluck", "year":"2007", "pdf":"http://people.csail.mit.edu/dfhuynh/research/papers/iswc2007-potluck.pdf", "pub-type":"inproceedings", "project":"Potluck", "booktitle":"$6^{th}$ International Semantic Web Conference (ISWC)", "projectsite":"http://simile.mit.edu/potluck/", "origin":"http://service.simile-widgets.org/babel/preview#Potluck%3A%20Data%20Mash-Up%20Tool%20for%20Casual%20Users" }, {"id":"User Interface Continuations", "label":"User Interface Continuations", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:1e97dddce2ae1534378ab55f5a8a5967", "modified":"no", "crossref":"The $16^{th}$ Annual Symposium on User Interface Software and Technology", "abstract":"Dialog boxes that collect parameters for commands often create ephemeral, unnatural interruptions of a program's normal execution flow, encouraging the user to complete the dialog box as quickly as possible in order for the program to process that command. In this paper we examine the idea of turning the act of collecting parameters from a user into a first class object called a user interface continuation. Programs can create user interface continuations by specifying what information is to be collected from the user and supplying a callback (i.e., a continuation) to be notified with the collected information. A partially completed user interface continuation can be saved as a new command, much as currying and partially evaluating a function with a set of parameters produces a new function. 
Furthermore, user interface continuations, like other continuation-passing paradigms, can be used to allow program execution to continue uninterrupted while the user determines a command's parameters at his or her leisure.", "date":"2003-11", "author":["Quan, Dennis","Huynh, David","Karger, David R.","Miller, Robert"], "pdfkb":"111", "venue":"UIST", "month":"November", "cat":["Information Retrieval","Haystack","Semantic Web","CHI"], "key":"Karger:UIContinuations", "year":"2003", "pdf":"http://haystack.csail.mit.edu/documents/papers/2003/uist2003-uicont.pdf", "pub-type":"inproceedings", "conferenceurl":"http://www.uist.org/", "booktitle":"The $16^{th}$ Annual Symposium on User Interface Software and Technology", "organization":"ACM", "address":"Vancouver, BC", "origin":"http://service.simile-widgets.org/babel/preview#User%20Interface%20Continuations" }, {"id":"{Spam-I-am}: A Proposal for Spam Control Using Distributed Quota Management", "label":"{Spam-I-am}: A Proposal for Spam Control Using Distributed Quota Management", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:f758ce2b38c00664a3f18cf7e40d3578", "modified":"no", "abstract":"Email spam has reached alarming proportions because it costs virtually nothing to send email; even a small number of people responding to a spam message is adequate incentive for a spammer to send as many messages as possible. Since spammers need to send messages at high rates to as many recipients as they can, quotas on email senders could throttle spam. We argue for separating the allocation of quotas, a relatively rare activity, from the enforcement of quotas, a frequent activity that must scale to the billions of messages sent daily. This paper tackles the quota enforcement problem, where the goal is to ensure that no sender can grossly violate its quota. 
The challenge is to design an enforcement scheme that is scalable, is robust against malicious attackers or participants, and preserves the privacy of communication, in a large, distributed, and untrusted environment. We discuss the design of such a system, Spam-I-am, based on a managed distributed hash table (DHT) interface, showing that it can be used in conjunction with electronic stamps (for quota allocation) to ensure that any non-negligible reuse of stamps will be detected.", "date":"2004-11", "author":["Balakrishnan, Hari","Karger, David R."], "editor":"Alex Snoeren", "venue":"HotNets", "month":"November", "cat":["Systems","P2P"], "key":"Karger:Spam", "year":"2004", "pdf":"http://ramp.ucsd.edu/conferences/HotNets-III/HotNets-III%20Proceedings/spamiam.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the Third Annual ACM SIGCOMM Workshop on Hot Topics in Networking ({HotNets-III})", "address":"San Diego, CA", "origin":"http://service.simile-widgets.org/babel/preview#%7BSpam-I-am%7D%3A%20A%20Proposal%20for%20Spam%20Control%20Using%20Distributed%20Quota%20Management" }, {"id":"Processing and visualizing the data in tweets", "label":"Processing and visualizing the data in tweets", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:972717137e66071abb3674311b7ce8f5", "modified":"no", "pages":"21--27", "date":"2012-01", "author":["Marcus, Adam","Bernstein, Michael S.","Badar, Osama","Karger, David R.","Madden, Samuel","Miller, Robert C."], "url":"http://doi.acm.org/10.1145/2094114.2094120", "doi":"10.1145/2094114.2094120", "issue_date":"December 2011", "acmid":"2094120", "volume":"40", "publisher":"ACM", "month":"January", "cat":["CHI","Systems","Information Retrieval"], "journal":"SIGMOD Record", "key":"Karger:TwitinfoSigmod", "numpages":"7", "year":"2012", "pub-type":"article", "issn":"0163-5808", "number":"4", "address":"New York, NY, USA", 
"origin":"http://service.simile-widgets.org/babel/preview#Processing%20and%20visualizing%20the%20data%20in%20tweets" }, {"id":"5b1f2025821e6980014e0b97839b78c7", "label":"Subjective Cost Policy Routing", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:5b1f2025821e6980014e0b97839b78c7", "modified":"no", "abstract":"We study a model of path-vector routing in which nodes' routing policies are based on subjective cost assessments of alternative routes. The routes are constrained by the requirement that all routes to a given destination must be confluent. We show that it is NP-hard to determine whether there is a set of stable routes. We also show that it is NP-hard to find a set of confluent routes that minimizes the total subjective cost; it is hard even to approximate the minimum cost closely. These hardness results hold even for very restricted classes of subjective costs. We then consider a model in which the subjective costs are based on the relative importance nodes place on a small number of objective cost measures. We show that a small number of confluent routing trees is sufficient for each node to have a route that nearly minimizes its subjective cost. We show that this scheme is trivially strategy proof and that it can be computed easily with a distributed algorithm. Furthermore, we prove a lower bound on the number of trees required to contain a $(1+\\epsilon)$-approximately optimal route for each node and show that our scheme is nearly optimal in this respect. 
", "pages":"174--183", "date":"2005-12", "author":["Feigenbaum, Joan","Karger, David R.","Mirrokni, Vahab S.","Sami, Rahul"], "doi":"http://dx.doi.org/10.1007/11600930_18", "venue":"WINE", "month":"December", "cat":["Theory","Mechanism Design"], "key":"Karger:SubjectiveCostRouting", "year":"2005", "pub-type":"inproceedings", "booktitle":"Internet and Network Economics: First International Workshop, WINE 2005", "origin":"http://service.simile-widgets.org/babel/preview#5b1f2025821e6980014e0b97839b78c7" }, {"id":"Haystack: A General Purpose Information Management Tool for End Users of Semistructured Data", "label":"Haystack: A General Purpose Information Management Tool for End Users of Semistructured Data", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:508d7952cecf98a7ed749c26bcddc096", "modified":"no", "abstract":"We posit that a semistructured data model offers the right balance of rich structure and flexible (or lack of) schema allowing naive end users to record information in whatever form makes it easy for them to manage. We describe our Haystack system, which exposes the richness and flexibility of the data model while offering the user natural, traditional interfaces that shield them from the specifics of schemas, tuples, and database queries. 
We outline research challenges that remain to be addressed.", "pages":"13--26", "date":"2005-01", "author":["Karger, David R.","Bakshi, Karun","Huynh, David","Quan, Dennis","Sinha, Vineet"], "work-done-at":"MIT CSAIL", "venue":"CIDR", "month":"January", "cat":["Haystack","Information Retrieval"], "key":"karger:Haystack-CIDR", "year":"2005", "pdf":"http://www-db.cs.wisc.edu/cidr/cidr2005/papers/P02.pdf", "pub-type":"inproceedings", "confurl":"http://www-db.cs.wisc.edu/cidr/cidr2005/", "booktitle":"Conference on Innovative Database Research (CIDR)", "origin":"http://service.simile-widgets.org/babel/preview#Haystack%3A%20A%20General%20Purpose%20Information%20Management%20Tool%20for%20End%20Users%20of%20Semistructured%20Data" }, {"id":"6ca618ece4cab98fa84adca4313b2dc3", "label":"A Better Algorithm for an Ancient Scheduling Problem", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:6ca618ece4cab98fa84adca4313b2dc3", "modified":"no", "note":"Journal version appears in Journal of Algorithms 20", "crossref":"Proceedings of the {$5^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "place":"Arlington, VA", "abstract":"One of the oldest and simplest variants of multiprocessor scheduling is the on-line scheduling problem studied by Graham in 1966. In this problem, the jobs arrive on-line and must be scheduled non-preemptively on m identical machines so as to minimize the makespan. The size of a job is known on arrival. Graham proved that the List Processing Algorithm which assigns each job to the currently least loaded machine has competitive ratio $(2 - 1/m)$. Recently algorithms with smaller competitive ratios than List Processing have been discovered, culminating in Bartal, Fiat, Karloff, and Vohra's construction of an algorithm with competitive ratio bounded away from 2. Their algorithm has a competitive ratio of at most $(2 - 1/70) \\approx 1.986$ for all m; hence for m > 70, their algorithm is provably better than List Processing. 
We present a more natural algorithm that outperforms List Processing for any $m \\geq 6$ and has a competitive ratio of at most 1.945 for all m, which is significantly closer to the best known lower bound of 1.837 for the problem. We show that our analysis of the algorithm is almost tight by presenting a lower bound of 1.9378 on the algorithm's competitive ratio for large m. ", "pages":"132--140", "date":"1994-01", "ps":"http://people.csail.mit.edu/karger/Papers/makespan.ps", "author":["Karger, David R.","Phillips, Steven","Torng, Eric"], "editor":"Daniel D. Sleator", "venue":"SODA", "month":"January", "cat":["Theory","Scheduling"], "key":"Karger:Makespan-Conf", "year":"1994", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$5^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "organization":"ACM-SIAM", "origin":"http://service.simile-widgets.org/babel/preview#6ca618ece4cab98fa84adca4313b2dc3" }, {"id":"Mailing Lists: Why Are They Still Here, What's Wrong With Them, and How Can We Fix Them?", "label":"Mailing Lists: Why Are They Still Here, What's Wrong With Them, and How Can We Fix Them?", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:0c4520811139d0030e5c868bbc510c7d", "modified":"no", "pages":"4009--4018", "date":"2015-04", "author":["Zhang, Amy X.","Ackerman, Mark S.","Karger, David R."], "url":"http://doi.acm.org/10.1145/2702123.2702194", "doi":"10.1145/2702123.2702194", "acmid":"2702194", "venue":"CHI", "publisher":"ACM", "month":"April", "cat":["CHI","Ethnography"], "series":"CHI '15", "key":"Karger:MailingLists", "numpages":"10", "year":"2015", "isbn":"978-1-4503-3145-6", "pub-type":"inproceedings", "keywords":["discussion groups","email","mailing lists","online communities"], "booktitle":"Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems", "location":"Seoul, Republic of Korea", "address":"New York, NY, USA", 
"origin":"http://service.simile-widgets.org/babel/preview#Mailing%20Lists%3A%20Why%20Are%20They%20Still%20Here%2C%20What's%20Wrong%20With%20Them%2C%20and%20How%20Can%20We%20Fix%20Them%3F" }, {"id":"Scheduling Trees with Communication and Precedence Delays", "label":"Scheduling Trees with Communication and Precedence Delays", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:765d84ffe77cd6e06495b57c04b8b0a4", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"Proceedings of the {$12^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "place":"Washington, DC", "date":"2001-01", "ps":"Papers/engels01parallel.ps", "author":["Engels, Daniel W.","Feldman, Jon","Karger, David R.","Ruhl, Matthias"], "editor":"S. Rao Kosaraju", "venue":"SODA", "month":"January", "cat":"Theory", "key":"Karger:Pipeline", "year":"2001", "pub-type":"inproceedings", "confurl":"http://portal.acm.org/toc.cfm?id=365411&dl=GUIDE&dl=ACM&type=proceeding&idx=SERIES422&part=Proceedings&WantType=Proceedings", "booktitle":"Proceedings of the {$12^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "organization":"ACM-SIAM", "origin":"http://service.simile-widgets.org/babel/preview#Scheduling%20Trees%20with%20Communication%20and%20Precedence%20Delays" }, {"id":"Soylent: a word processor with a crowd inside", "label":"Soylent: a word processor with a crowd inside", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:4575b7e2919fc3f43cdb3a426f1d4291", "modified":"no", "abstract":"This paper introduces architectural and interaction patterns for integrating crowdsourced human contributions directly into user interfaces. We focus on writing and editing, complex endeavors that span many levels of conceptual and pragmatic activity. Authoring tools offer help with pragmatics, but for higher-level help, writers commonly turn to other people. 
We thus present Soylent, a word processing interface that enables writers to call on Mechanical Turk workers to shorten, proofread, and otherwise edit parts of their documents on demand. To improve worker quality, we introduce the Find-Fix-Verify crowd programming pattern, which splits tasks into a series of generation and review stages. Evaluation studies demonstrate the feasibility of crowdsourced editing and investigate questions of reliability, cost, wait time, and work time for edits.", "pages":"313--322", "date":"2010-11", "author":["Bernstein, Michael S.","Little, Greg","Miller, Robert C.","Hartmann, Bj\\\"{o}rn","Ackerman, Mark S.","Karger, David R.","Crowell, David","Panovich, Katrina"], "doi":"http://doi.acm.org/10.1145/1866029.1866078", "publisher":"ACM", "month":"November", "cat":["CHI","Systems","Mechanism Design"], "key":"Karger:Soylent", "year":"2010", "isbn":"978-1-4503-0271-5", "pdf":"http://people.csail.mit.edu/msbernst/papers/soylent-uist2010.pdf", "pub-type":"inproceedings", "booktitle":"UIST '10: Proceedings of the 23nd annual ACM symposium on User interface software and technology", "location":"New York, New York, USA", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Soylent%3A%20a%20word%20processor%20with%20a%20crowd%20inside" }, {"id":"Experimental Study of Minimum Cut Algorithms", "label":"Experimental Study of Minimum Cut Algorithms", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:ce03d3476b56d2d0cd86629cf8367d4d", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "crossref":"Proceedings of the {$8^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "place":"New Orleans, LA", "abstract":"Recently, several new algorithms have been developed for the minimum cut problem. These algorithms are very different from the earlier ones and from each other and substantially improve the worst-case time bounds for the problem. 
In this paper, we conduct an experimental evaluation of the relative performance of these algorithms. In the process, we develop heuristics and data structures that substantially improve the practical performance of the algorithms. We also develop problem families for testing minimum cut algorithms. Our work leads to a better understanding of the practical performance of minimum cut algorithms and produces very efficient codes for the problem.", "pages":"324--333", "date":"1997-01", "author":["Chekuri, Chandra C.","Goldberg, Andrew V.","Karger, David R.","Levine, Matthew S.","Stein, Cliff"], "editor":"Michael Saks", "venue":"SODA", "month":"January", "cat":["Theory","Cuts and Flows"], "key":"Karger:ImpCut", "year":"1997", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$8^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "organization":"ACM-SIAM", "origin":"http://service.simile-widgets.org/babel/preview#Experimental%20Study%20of%20Minimum%20Cut%20Algorithms" }, {"id":"Improved Approximations for Multiprocessor Scheduling Under Uncertainty", "label":"Improved Approximations for Multiprocessor Scheduling Under Uncertainty", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:99f83c4b853573c10952c79febff2fe6", "modified":"no", "date":"2008-06", "author":["Crutchfield, Christopher Y.","Dzunic, Zoran","Fineman, Jeremy T.","Karger, David R.","Scott, Jacob H."], "venue":"SPAA", "month":"June", "cat":"Theory", "key":"Karger:SchedulingUncertainty", "year":"2008", "pub-type":"inproceedings", "booktitle":"Proceedings of the Twentieth ACM Symposium on Parallelism in Algorithms and Architectures (SPAA)", "organization":"ACM", "address":"Munich, Germany", "origin":"http://service.simile-widgets.org/babel/preview#Improved%20Approximations%20for%20Multiprocessor%20Scheduling%20Under%20Uncertainty" }, {"id":"Opportunities and Challenges Around a Tool for Social and Public Web Activity Tracking", "label":"Opportunities and Challenges Around a 
Tool for Social and Public Web Activity Tracking", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:a65cf3ef505d6713b5dce1d5b5fd0d47", "modified":"no", "pages":"913--925", "date":"2016-02", "author":["Zhang, Amy X.","Blum, Joshua","Karger, David R."], "doi":"http://doi.acm.org/10.1145/2818048.2819949", "acmid":"2819949", "publisher":"ACM", "doin":"10.1145/2818048.2819949", "month":"February", "cat":"CHI", "series":"CSCW '16", "key":"Karger:Eyebrowse", "numpages":"13", "year":"2016", "isbn":"978-1-4503-3592-8", "pdf":"Papers/eyebrowse.pdf", "pub-type":"inproceedings", "keywords":["activity traces","privacy","self-presentation","sharing motivations","social media","web analytics","web browsing","web tracking"], "booktitle":"Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work \\& Social Computing", "location":"San Francisco, California, USA", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Opportunities%20and%20Challenges%20Around%20a%20Tool%20for%20Social%20and%20Public%20Web%20Activity%20Tracking", "abstract": "While the web contains many social websites, people are generally left in the dark about the activities of other people traversing the web as a whole. In this paper, we explore the potential benefits and privacy considerations around generating a real-time, publicly accessible stream of web activity where users can publish chosen parts of their web browsing data. Taking inspiration from social media systems, we describe individual benefits that can be unlocked by such sharing and that may incentivize users to publish aspects of their browsing. We ask whether and how these benefits outweigh potential costs in lost privacy. We conduct our study of public web activity sharing through scenario-based interviews and a field deployment of a tool for web activity sharing." 
}, {"id":"447fbc29b038fc3a64b07bbed265eb37", "label":"A Randomized Fully Polynomial Approximation Scheme for the All Terminal Network Reliability Problem", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:447fbc29b038fc3a64b07bbed265eb37", "modified":"no", "note":"A preliminary version appeared in Proceedings of the {$27^{th}$} {ACM} Symposium on Theory of Computing. This corrects a version published in SICOMP", "abstract":"The classic all-terminal network reliability problem posits a graph, each of whose edges fails independently with some given probability. The goal is to determine the probability that the network becomes disconnected due to edge failures. This problem has obvious applications in the design of communication networks. Since the problem is #P-complete and thus believed hard to solve exactly, a great deal of research has been devoted to estimating the failure probability. In this paper, we give a fully polynomial randomized approximation scheme that, given any n-vertex graph with specified failure probabilities, computes in time polynomial in n and 1/ε an estimate for the failure probability that is accurate to within a relative error of 1 ± ε with high probability. We also give a deterministic polynomial approximation scheme for the case of small failure probabilities. Some extensions to evaluating probabilities of k-connectivity, strong connectivity in directed Eulerian graphs and r-way disconnection, and to evaluating the Tutte polynomial are also described. This version of the paper corrects several errata that appeared in the previous journal publication [D. R. Karger, SIAM J. Comput., 29 (1999), pp. 
492-514].", "pages":"499--522", "date":"2001", "ps":"Papers/reliability-sirev.ps", "author":"Karger, David R.", "volume":"43", "cat":["Theory","Cuts and Flows"], "journal":"SIAM Review", "key":"Karger:Reliability-SIREV", "year":"2001", "brag":"Winner, SIAM Outstanding Paper Prize, 2000", "pdf":"Papers/reliability-sirev.pdf", "pub-type":"article", "number":"3", "origin":"http://service.simile-widgets.org/babel/preview#447fbc29b038fc3a64b07bbed265eb37" }, {"id":"271a88026e949c13c07c70598144196d", "label":"Approximation Algorithms for Orienteering and Discounted-Reward TSP", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:271a88026e949c13c07c70598144196d", "modified":"no", "note":"A preliminary version appeared in Proceedings of the {$36^{th}$} Annual Symposium on the Foundations of Computer Science", "pages":"653-670", "date":"2007", "author":["Blum, Avrim","Chawla, Shuchi","Karger, David R.","Lane, Terran","Meyerson, Adam","Minkoff, Maria"], "doi":"http://dx.doi.org/10.1137/050645464", "volume":"37", "cat":"Theory", "journal":"SIAM Journal on Computing", "key":"Karger:DiscountTSP-Journal", "year":"2007", "pdf":"orienteering-sicomp.pdf", "pub-type":"article", "number":"2", "origin":"http://service.simile-widgets.org/babel/preview#271a88026e949c13c07c70598144196d" }, {"id":"A near-linear time algorithm for constructing a cactus representation of minimum cuts", "label":"A near-linear time algorithm for constructing a cactus representation of minimum cuts", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:f068e3af9e7f16d67c8685ed1f2582a8", "modified":"no", "pages":"246--255", "date":"2009-01", "author":["Karger, David R.","Panigrahi, Debmalya"], "venue":"SODA", "publisher":"Society for Industrial and Applied Mathematics", "month":"January", "cat":["Theory","Cuts and Flows"], "key":"Karger:Cactus", "year":"2009", "pdf":"soda09-cactus.pdf", "pub-type":"inproceedings", "booktitle":"SODA '09: Proceedings of the Nineteenth 
Annual ACM-SIAM Symposium on Discrete Algorithms", "location":"New York, New York", "address":"Philadelphia, PA, USA", "origin":"http://service.simile-widgets.org/babel/preview#A%20near-linear%20time%20algorithm%20for%20constructing%20a%20cactus%20representation%20of%20minimum%20cuts" }, {"id":"Crowdsourced Databases: Query Processing with People", "label":"Crowdsourced Databases: Query Processing with People", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:d0d0a5805ee086809e9edffd751763c2", "modified":"no", "date":"2011-01", "author":["Marcus, Adam","Wu, Eugene","Karger, David R.","Madden, Samuel","Miller, Robert C."], "venue":"CIDR", "month":"January", "cat":["CHI","Systems","Information Retrieval"], "key":"Karger:Qurk", "year":"2011", "pdf":"http://people.csail.mit.edu/marcua/papers/qurk-cidr2011.pdf", "pub-type":"inproceedings", "booktitle":"Conference on Innovation in Database Research (CIDR) 2011", "origin":"http://service.simile-widgets.org/babel/preview#Crowdsourced%20Databases%3A%20Query%20Processing%20with%20People" }, {"id":"Scatter/Gather as a Tool for Navigating Search Results", "label":"Scatter/Gather as a Tool for Navigating Search Results", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:d49fa873e4c0f5185d9f3a060fecd2bf", "modified":"no", "abstract":" An important information access problem arises when the user is confronted with a very large number of documents that have been retrieved in response to a query. In this paper we explore the use of a technique, called Scatter/Gather, for the navigation of large collections of retrieved documents. Scatter/Gather clusters the documents into semantically coherent groups on-the-fly and presents descriptive summaries of the groups to the user. 
These groups can be used in several ways: to identify useful subsets of documents to be perused with other tools, to eliminate subsets whose contents appear nonrelevant, or to select promising document subsets for reclustering into more refined groups. This paper describes the Scatter/Gather algorithm and illustrates its application to retrieval results via two examples.", "date":"1995", "ps":"ftp://parcftp.xerox.com/pub/hearst/knowlnav95.ps", "author":["Hearst, Marti A.","Karger, David R.","Pedersen, Jan O."], "venue":"AAAI", "cat":"Information Retrieval", "key":"Karger:ScatterResults", "year":"1995", "pub-type":"inproceedings", "booktitle":"Proceedings of the AAAI Fall Symposium on Knowledge Navigation", "origin":"http://service.simile-widgets.org/babel/preview#Scatter%2FGather%20as%20a%20Tool%20for%20Navigating%20Search%20Results" }, {"id":"Twitinfo: aggregating and visualizing microblogs for event exploration", "label":"Twitinfo: aggregating and visualizing microblogs for event exploration", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:1aac5281265faf7dadfb4f0311275062", "modified":"no", "pages":"227--236", "date":"2011-05", "author":["Marcus, Adam","Bernstein, Michael S.","Badar, Osama","Karger, David R.","Madden, Samuel","Miller, Robert C."], "url":"http://twitinfo.csail.mit.edu/", "doi":"http://doi.acm.org/10.1145/1978942.1978975", "acmid":"1978975", "venue":"CHI", "publisher":"ACM", "month":"May", "cat":["CHI","Systems","Information Retrieval"], "series":"CHI '11", "key":"Karger:Twitinfo", "numpages":"10", "year":"2011", "isbn":"978-1-4503-0228-9", "pdf":"http://people.csail.mit.edu/marcua/papers/twitinfo-chi2011.pdf", "pub-type":"inproceedings", "keywords":"twitter visualization streaming aggregate sentiment", "booktitle":"Proceedings of the 2011 annual conference on Human factors in computing systems", "location":"Vancouver, BC, Canada", "address":"New York, NY, USA", 
"origin":"http://service.simile-widgets.org/babel/preview#Twitinfo%3A%20aggregating%20and%20visualizing%20microblogs%20for%20event%20exploration" }, {"id":"c930cfcaff1fb275b3a13e9e8fd24952", "label":"Haystack: Per-User Information Environments", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:c930cfcaff1fb275b3a13e9e8fd24952", "modified":"no", "abstract":"Haystack is a system that aims to maximize every individual user's control over the way he or she records, views, organizes, and searches for information. In this paper we discuss the elements of the system: a flexible semantic-net data model that can stretch to accomodate whatever information, relationships, properties, and categories a user considers important, and a user interface framework that can effecively display the personalized information space in ways that make sense to and can be customized by the end user.", "pages":"49-100", "date":"2007", "author":"Karger, David R.", "editor":"Victor Kaptelinin and Mary Czerwinski", "publisher":"The MIT Press", "cat":["Haystack","Systems","Information Retrieval","Semantic Web"], "chapter":"7", "key":"Karger:DesktopBook", "year":"2007", "pdf":"Papers/desktopchapter.pdf", "pub-type":"incollection", "booktitle":"Beyond the Desktop Metaphor: Designing Integrated Digital Work Environments", "address":"Cambridge, MA", "origin":"http://service.simile-widgets.org/babel/preview#c930cfcaff1fb275b3a13e9e8fd24952" }, {"id":"An {$\\Olog(n^2)$} Algorithm for Minimum Cuts", "label":"An {$\\Olog(n^2)$} Algorithm for Minimum Cuts", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:e9d3d2f131a3b735437f1eea0ec6d2f7", "modified":"no", "note":"Journal version appears in Journal of the ACM 43(4)", "crossref":"Proceedings of the {$25^{th}$} {ACM} Symposium on Theory of Computing", "place":"San Diego, CA", "abstract":" This paper present a new approach to finding minimum cuts in undirected graphs. 
The fundamental principle is simple: the edges in a graph's minimum cut form an extremely small fraction of the graph's edges. Using this idea, we give a randomized, strongly polynomial algorithm that finds the minimum cut in an arbitrarily weighted undirected graph with high probability. The algorithm runs in $O(n^2 \\log^3 n)$ time, a significant improvement over the previous ${\\tilde O}(mn)$ time bounds based on maximum flows. It is simple and intuitive and uses no complex data structures. Our algorithm can be parallelized to run in RNC with $n^2$ processors; this gives the first proof that the minimum cut problem can be solved in RNC. The algorithm does more than find a single minimum cut; it finds all of them. With minor modifications, our algorithm solves two other problems of interest. Our algorithm finds all cuts with value within a multiplicative factor of $\\alpha$ of the minimum cut's in expected ${\\tilde O}(n^2)$ time, or in RNC with $n^{2\\alpha}$ processors. The problem of finding a minimum multiway cut of a graph into r pieces is solved in expected ${\\tilde O}(n^{2(r-1)})$ time, or in RNC with $n^{2(r-1)}$ processors. The ``trace'' of the algorithm's execution on these two problems forms a new compact data structure for representing all small cuts and all multiway cuts in a graph. This data structure can be efficiently transformed into the more standard cactus representation for minimum cuts. 
", "pages":"757-765", "date":"1993-05", "ps":"http://people.csail.mit.edu/karger/Papers/fastcut.ps", "author":["Karger, David R.","Stein, Clifford"], "editor":"Alok Aggarwal", "venue":"STOC", "publisher":"ACM Press", "month":"May", "cat":["Theory","Cuts and Flows"], "key":"Karger:FastCut", "year":"1993", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$25^{th}$} {ACM} Symposium on Theory of Computing", "organization":"ACM", "origin":"http://service.simile-widgets.org/babel/preview#An%20%7B%24%5COlog(n%5E2)%24%7D%20Algorithm%20for%20Minimum%20Cuts" }, {"id":"Internet Surveillance of Pro-drug Websites. I. Incidence of Club Drug Reporting Over a One-Year Period.", "label":"Internet Surveillance of Pro-drug Websites. I. Incidence of Club Drug Reporting Over a One-Year Period.", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:d9b6a5992445a8fc78bc11678df55032", "modified":"no", "note":"(abstract)", "pages":"536", "date":"2001", "author":["Boyer, Edward W.","Shih, Kai","Karger, David R.","Quang, L.","Case, P."], "volume":"39", "cat":"Information Retrieval", "journal":"Journal of Toxicology: Clinical Toxicology", "key":"Karger:Tox1", "year":"2001", "pub-type":"article", "origin":"http://service.simile-widgets.org/babel/preview#Internet%20Surveillance%20of%20Pro-drug%20Websites.%20I.%20Incidence%20of%20Club%20Drug%20Reporting%20Over%20a%20One-Year%20Period." }, {"id":"Better Random Sampling Algorithms for Flows in Undirected Graphs", "label":"Better Random Sampling Algorithms for Flows in Undirected Graphs", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:1fbdf7e84541d6c19a324381c8287f91", "modified":"no", "crossref":"Proceedings of the {$9^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "place":"San Fransisco, CA", "abstract":"We present better random sampling algorithms for maximum flows in undirected graphs. 
Our algorithms apply to capacitated or uncapacitated graphs, and find a maximum flow of value v in ${\\tilde O}(\\sqrt{mn}\\,v)$ time. This improves on a previous bound of ${\\tilde O}(m^{2/3} n^{1/3} v)$ given by the author recently, which in turn improved on the O(mv) time bound for a typical augmenting path algorithm. In uncapacitated graphs without parallel edges, the bound is no worse than ${\\tilde O}(n^{5/2})$. We give another algorithm that finds a $(1 - \\epsilon)$ times maximum flow in time ${\\tilde O}(m\\sqrt{n}/\\epsilon)$, regardless of v. ", "pages":"490--499", "date":"1998-01", "ps":"http://people.csail.mit.edu/karger/Papers/flow2.ps", "author":"Karger, David R.", "editor":"Howard Karloff", "venue":"SODA", "month":"January", "cat":["Theory","Cuts and Flows"], "key":"Karger:SmoothFlow", "year":"1998", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$9^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "organization":"ACM-SIAM", "origin":"http://service.simile-widgets.org/babel/preview#Better%20Random%20Sampling%20Algorithms%20for%20Flows%20in%20Undirected%20Graphs" }, {"id":"3ef548c744b529ad4ecb4f888cbd1720", "label":"An Experimental Study of Polylogarithmic Fully-Dynamic Connectivity Algorithms", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:3ef548c744b529ad4ecb4f888cbd1720", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "abstract":"We present an experimental study of different variants of the amortized $O(\\log^2 n)$-time fully-dynamic connectivity algorithm of Holm, de Lichtenberg, and Thorup (STOC'98). The experiments build upon experiments provided by Alberts, Cattaneo, and Italiano (SODA'96) on the randomized amortized $O(\\log^3 n)$ fully-dynamic connectivity algorithm of Henzinger and King (STOC'95). Our experiments shed light upon similarities and differences between the two algorithms. 
We also present a slightly modified version of the Henzinger-King algorithm that runs in $O(\\log^2 n)$ time, which resulted from our experiments.", "date":"2000-01", "ps":"http://people.csail.mit.edu/karger/Papers/impconn.ps", "author":["Iyer, Raj D.","Karger, David R.","Rahul, Hariharan","Thorup, Mikkel"], "venue":"ALENEX", "month":"January", "cat":"Theory", "key":"Karger:ImpConn-Conf", "year":"2000", "brag":"ALENEX00 special issue of the Journal of Experimental Algorithmics", "pub-type":"inproceedings", "booktitle":"Proceedings of ALENEX00: Workshop on Algorithm Engineering and Experimentation", "origin":"http://service.simile-widgets.org/babel/preview#3ef548c744b529ad4ecb4f888cbd1720" }, {"id":"a0c11590bc0560a7055bf3e1f9aa123a", "label":"Optimal Rounding Algorithms for a Geometric Embedding of the Multiway Cut Problem", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:a0c11590bc0560a7055bf3e1f9aa123a", "modified":"no", "crossref":"Proceedings of the {$30^{th}$} {ACM} Symposium on Theory of Computing", "place":"Philadelphia, PA", "abstract":"Given an undirected graph with edge costs and a subset of $k \\ge 3$ nodes called terminals, a multiway, or k-way, cut is a subset of the edges whose removal disconnects each terminal from the others. The multiway cut problem is to find a minimum-cost multiway cut. This problem is Max-SNP hard. Recently Calinescu, Karloff, and Rabani (STOC'98) gave a novel geometric relaxation of the problem and a rounding scheme that produced a (3/2 - 1/k)-approximation algorithm. In this paper, we study their geometric relaxation. In particular, we study the worst-case ratio between the value of the relaxation and the value of the minimum multicut (the so-called integrality gap of the relaxation). For k = 3, we show the integrality gap is 12/11, giving tight upper and lower bounds. That is, we exhibit a graph with integrality gap 12/11 and give an algorithm that finds a cut of value 12/11 times the relaxation value. 
This is the best possible performance guarantee for any algorithm based purely on the value of the relaxation and improves on Calinescu et al.'s factor of 7/6. We also improve the upper bounds for all larger values of k. For k = 4, 5, our best upper bounds are based on computer constructed and analyzed rounding schemes, while for k > 6 we give an algorithm with performance ratio $1.3438 - \\epsilon_k$. Our results were discovered with the help of computational experiments that we also describe here. ", "pages":"668--677", "date":"1999-05", "ps":"http://people.csail.mit.edu/karger/Papers/kcut.ps", "author":["Karger, David R.","Klein, Philip N.","Stein, Clifford","Thorup, Mikkel","Young, Neal"], "venue":"STOC", "publisher":"ACM Press", "month":"May", "cat":"Theory", "key":"Karger:Multicut", "year":"1999", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$30^{th}$} {ACM} Symposium on Theory of Computing", "organization":"ACM", "origin":"http://service.simile-widgets.org/babel/preview#a0c11590bc0560a7055bf3e1f9aa123a" }, {"id":"dc6a03fc2bc93b438bd1298340ac40f3", "label":"On Approximating the Longest Path in a Graph", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:dc6a03fc2bc93b438bd1298340ac40f3", "modified":"no", "note":"Journal version appears in Algorithmica 18(1)", "abstract":"We consider the problem of approximating the longest path in undirected graphs. In an attempt to pin down the best achievable performance ratio of an approximation algorithm for this problem, we present both positive and negative results. First, a simple greedy algorithm is shown to find long paths in dense graphs. We then consider the problem of finding paths in graphs that are guaranteed to have extremely long paths. We devise an algorithm that finds paths of a logarithmic length in Hamiltonian graphs. This algorithm works for a much larger class of graphs (weakly Hamiltonian), where the result is the best possible. 
Since the hard case appears to be that of sparse graphs, we also consider sparse random graphs. Here we show that a relatively long path can be obtained, thereby partially answering an open problem of Broder et al. To explain the difficulty of obtaining better approximations, we also prove hardness results. We show that, for any $\\varepsilon < 1$, the problem of finding a path of length $n - n^{\\varepsilon}$ in an $n$-vertex Hamiltonian graph is NP-hard. We then show that no polynomial-time algorithm can find a constant factor approximation to the longest-path problem unless P = NP. We conjecture that the result can be strengthened to say that, for some constant $\\delta > 0$, finding an approximation of ratio $n^{\\delta}$ is also NP-hard. As evidence toward this conjecture, we show that if any polynomial-time algorithm can approximate the longest path to a ratio of $2^{O(\\log^{1 - \\varepsilon} n)}$, for any $\\varepsilon > 0$, then NP has a quasi-polynomial deterministic time simulation. The hardness results apply even to the special case where the input consists of bounded degree graphs. ", "pages":"421--430", "date":"1993-08", "author":["Karger, David R.","Ramkumar, G. D. 
S.","Motwani, Rajeev"], "editor":"Frank Dehne", "venue":"WADS", "publisher":"Springer-Verlag", "month":"August", "cat":"Theory", "series":"Lecture Notes in Computer Science", "key":"Karger:Hamilton-Conf", "year":"1993", "pub-type":"inproceedings", "booktitle":"WADS93: Algorithms and Data Structures : Third Workshop", "number":"709", "origin":"http://service.simile-widgets.org/babel/preview#dc6a03fc2bc93b438bd1298340ac40f3" }, {"id":"Global Models of Document Structure Using Latent Permutations", "label":"Global Models of Document Structure Using Latent Permutations", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:7f5301459b108530d23281acd9e2df4f", "modified":"no", "date":"2009-05", "author":["Chen, Harr","Branavan, S.R.K.","Barzilay, Regina","Karger, David R."], "venue":"NAACL", "month":"May", "cat":["Machine Learning","Information Retrieval"], "key":"Karger:Mallows", "year":"2009", "pdf":"http://people.csail.mit.edu/regina/my_papers/perm.pdf", "pub-type":"inproceedings", "booktitle":"North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL HLT) conference", "address":"Boulder, CO, USA", "origin":"http://service.simile-widgets.org/babel/preview#Global%20Models%20of%20Document%20Structure%20Using%20Latent%20Permutations" }, {"id":"8500b20662f85c8b510dd5665064c75b", "label":"Finding the Hidden Path: Time Bounds for All-Pairs Shortest Paths", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:8500b20662f85c8b510dd5665064c75b", "modified":"no", "note":"A preliminary version appeared in Proceedings of the {$32^{nd}$} Annual Symposium on the Foundations of Computer Science", "crossref":"a43beb86a4092fe9f25ee627b9843bbe", "abstract":"The all-pairs shortest-paths problem in weighted graphs is investigated. 
An algorithm---the Hidden-Paths Algorithm---that finds these paths in time $O(m^ * n + n^2 \\log n)$, where $m^ * $ is the number of edges participating in shortest paths, is presented. The algorithm is a practical substitute for Dijkstra's algorithm. It is argued that $m^ * $ is likely to be small in practice since $m^ * = O(n\\log n)$ with high probability for many probability distributions on edge weights. An $\\Omega (mn)$ lower bound on the running time of any path-comparison-based algorithm for the all-pairs shortest-paths problem is also proved. Path-comparison-based algorithms form a natural class containing the Hidden-Paths Algorithm, as well as the algorithms of E. W. Dijkstra [Numer. Math., 1 (1959), pp. 269--271] and R. W. Floyd [Comm. ACM, 5 (1962), p. 345]. Lastly, generalized forms of the shortest-paths problem are considered, and it is shown that many of the standard shortest-paths algorithms are effective in this more general setting.", "pages":"1199--1217", "date":"1993-12", "ps":"http://people.csail.mit.edu/karger/Papers/path.journal.ps", "author":["Karger, David R.","Koller, Daphne","Phillips, Steven J."], "volume":"22", "month":"December", "cat":"Theory", "journal":"SIAM Journal on Computing", "key":"Karger:Paths", "year":"1993", "pub-type":"article", "number":"6", "origin":"http://service.simile-widgets.org/babel/preview#8500b20662f85c8b510dd5665064c75b" }, {"id":"Global Min-cuts in {$\\RNC$} and Other Ramifications of a Simple Mincut Algorithm", "label":"Global Min-cuts in {$\\RNC$} and Other Ramifications of a Simple Mincut Algorithm", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:073eeb13cb628cce2a7e45babdd6de7b", "modified":"no", "note":"This work was merged with later work into Journal of the ACM43(4)", "crossref":"Proceedings of the {$4^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "place":"Austin, TX", "abstract":"This paper presents a new algorithm for finding global min-cuts in weighted, 
undirected graphs. One of the strengths of the algorithm is its extreme simplicity. This randomized algorithm can be implemented as a strongly polynomial sequential algorithm with running time ${\\tilde O}(mn^2)$, even if space is restricted to O(n), or can be parallelized as an RNC algorithm which runs in time $O(\\log^2 n)$ on a CRCW PRAM with $mn^2 \\log n$ processors. In addition to yielding the best known processor bounds on unweighted graphs, this algorithm provides the first proof that the min-cut problem for weighted undirected graphs is in RNC. The algorithm does more than find a single min-cut; it finds all of them. The algorithm also yields numerous results on network reliability, enumeration of cuts, multi-way cuts, and approximate min-cuts.", "pages":"21--30", "date":"1993-01", "ps":"Papers/mincut.ps", "author":"Karger, David R.", "venue":"SODA", "month":"January", "cat":["Theory","Cuts and Flows"], "key":"Karger:Mincut", "year":"1993", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$4^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "organization":"ACM-SIAM", "origin":"http://service.simile-widgets.org/babel/preview#Global%20Min-cuts%20in%20%7B%24%5CRNC%24%7D%20and%20Other%20Ramifications%20of%20a%20Simple%20Mincut%20Algorithm" }, {"id":"cc012100621c64fe216f5c6a29578143", "label":"Job Scheduling in Rings", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:cc012100621c64fe216f5c6a29578143", "modified":"no", "note":"Journal version appears in Journal of Parallel and Distributed Computing 45(2)", "crossref":"Proceedings of the {$6^{th}$} Annual {ACM}-{SIAM} Symposium on Parallel Algorithms and Architectures", "abstract":"We give a distributed approximation algorithm for job scheduling in a ring architecture. 
In contrast to many other parallel scheduling models, the model we consider captures the influence of the underlying communications network by specifying that task migration from one processor to another takes time proportional to the distance between those two processors in the network. As a result, our algorithm must balance computational load and communication time. The algorithm is simple, requires no global control, and yields schedules of length at most 4.22 times optimal. We also give a lower bound on the performance of any distributed algorithm and the results of simulation experiments which suggest better performance than does our worst-case analysis.", "pages":"210--219", "date":"1994-06", "ps":"Papers/ring.ps", "author":["Fizzano, Perry","Karger, David R.","Stein, Cliff","Wein, Joel"], "venue":"SPAA", "month":"June", "cat":"Theory", "key":"Karger:Ring-Conf", "year":"1994", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$6^{th}$} Annual {ACM}-{SIAM} Symposium on Parallel Algorithms and Architectures", "organization":"ACM", "origin":"http://service.simile-widgets.org/babel/preview#cc012100621c64fe216f5c6a29578143" }, {"id":"Analytic Methods for Optimizing Realtime Crowdsourcing", "label":"Analytic Methods for Optimizing Realtime Crowdsourcing", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:39139efaaab8deddca96fe9334b8e2c0", "modified":"no", "date":"2012-04", "author":["Bernstein, Michael S.","Karger, David R.","Miller, Robert C.","Brandt, Joel"], "url":"http://arxiv.org/abs/1204.2995", "month":"April", "cat":["Applications of Theory","Theory","Crowdsourcing"], "key":"Karger:CrowdQueuing", "year":"2012", "pdf":"http://arxiv.org/pdf/1204.2995v1", "pub-type":"inproceedings", "booktitle":"Collective Intelligence", "origin":"http://service.simile-widgets.org/babel/preview#Analytic%20Methods%20for%20Optimizing%20Realtime%20Crowdsourcing" }, {"id":"a43beb86a4092fe9f25ee627b9843bbe", "label":"Finding the Hidden Path: Time 
Bounds for All-Pairs Shortest Paths", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:a43beb86a4092fe9f25ee627b9843bbe", "modified":"no", "note":"A preliminary version appeared in Proceedings of the {$32^{nd}$} Annual Symposium on the Foundations of Computer Science", "abstract":"The all-pairs shortest-paths problem in weighted graphs is investigated. An algorithm---the Hidden-Paths Algorithm---that finds these paths in time $O(m^ * n + n^2 \\log n)$, where $m^ * $ is the number of edges participating in shortest paths, is presented. The algorithm is a practical substitute for Dijkstra's algorithm. It is argued that $m^ * $ is likely to be small in practice since $m^ * = O(n\\log n)$ with high probability for many probability distributions on edge weights. An $\\Omega (mn)$ lower bound on the running time of any path-comparison-based algorithm for the all-pairs shortest-paths problem is also proved. Path-comparison-based algorithms form a natural class containing the Hidden-Paths Algorithm, as well as the algorithms of E. W. Dijkstra [Numer. Math., 1 (1959), pp. 269--271] and R. W. Floyd [Comm. ACM, 5 (1962), p. 345]. 
Lastly, generalized forms of the shortest-paths problem are considered, and it is shown that many of the standard shortest-paths algorithms are effective in this more general setting.", "pages":"1199--1217", "date":"1993-12", "ps":"http://people.csail.mit.edu/karger/Papers/path.journal.ps", "author":["Karger, David R.","Koller, Daphne","Phillips, Steven J."], "volume":"22", "month":"December", "cat":"Theory", "journal":"SIAM Journal on Computing", "key":"Karger:Paths-Journal", "year":"1993", "pub-type":"article", "number":"6", "origin":"http://service.simile-widgets.org/babel/preview#a43beb86a4092fe9f25ee627b9843bbe" }, {"id":"Chord: A Scalable Peer-to-Peer Lookup Protocol for Internet Applications", "label":"Chord: A Scalable Peer-to-Peer Lookup Protocol for Internet Applications", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:ad52536e3e8999b45232661e18387141", "modified":"no", "studentwork":"\\asteriskit{\\labelwidth} ", "abstract":"A fundamental problem that confronts peer-to-peer applications is the efficient location of the node that stores a desired data item. This paper presents Chord, a distributed lookup protocol that addresses this problem. Chord provides support for just one operation: given a key, it maps the key onto a node. Data location can be easily implemented on top of Chord by associating a key with each data item, and storing the key/data pair at the node to which the key maps. Chord adapts efficiently as nodes join and leave the system, and can answer queries even if the system is continuously changing. Results from theoretical analysis and simulations show that Chord is scalable: communication cost and the state maintained by each node scale logarithmically with the number of Chord nodes. ", "date":"2003-02", "author":["Stoica, Ion","Morris, Robert","Liben-Nowell, David","Karger, David R.","Kaashoek, M. 
Frans","Dabek, Frank","Balakrishnan, Hari"], "volume":"11", "month":"February", "cat":["Theory","Applications of Theory","P2P","Systems"], "journal":"IEEE Transactions on Networking", "key":"Karger:Chord", "year":"2003", "pdf":"http://www.pdos.csail.mit.edu/papers/ton:chord/paper-ton.pdf", "pub-type":"article", "origin":"http://service.simile-widgets.org/babel/preview#Chord%3A%20A%20Scalable%20Peer-to-Peer%20Lookup%20Protocol%20for%20Internet%20Applications" }, {"id":"Expressive Query Construction Through Direct Manipulation of Nested Relational Results", "label":"Expressive Query Construction Through Direct Manipulation of Nested Relational Results", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:58fa05f208ba56dce0d506559246bd2c", "modified":"no", "pages":"1377--1392", "date":"2016-07", "author":["Bakke, Eirik","Karger, David R."], "doi":"http://doi.acm.org/10.1145/2882903.2915210", "acmid":"2915210", "publisher":"ACM", "doin":"10.1145/2882903.2915210", "month":"July", "cat":["CHI","Databases","Systems"], "series":"SIGMOD '16", "key":"Karger:Sieuferd", "numpages":"16", "year":"2016", "isbn":"978-1-4503-3531-7", "pdf":"Papers/sieuferd.pdf", "pub-type":"inproceedings", "keywords":["direct manipulation","hierarchical data models","nested relations","report generation","spreadsheet interfaces","user studies","visual query languages","visual query systems"], "booktitle":"Proceedings of the 2016 International Conference on Management of Data", "location":"San Francisco, California, USA", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Expressive%20Query%20Construction%20Through%20Direct%20Manipulation%20of%20Nested%20Relational%20Results", "abstract": "Despite extensive research on visual query systems, the standard way to interact with relational databases remains to be through SQL queries and tailored form interfaces. 
We consider three requirements to be essential to a successful alternative: (1) query specification through direct manipulation of results, (2) the ability to view and modify any part of the current query without departing from the direct manipulation interface, and (3) SQL-like expressiveness. This paper presents the first visual query system to meet all three requirements in a single design. By directly manipulating nested relational results, and using spreadsheet idioms such as formulas and filters, the user can express a relationally complete set of query operators plus calculation, aggregation, outer joins, sorting, and nesting, while always remaining able to track and modify the state of the complete query. Our prototype gives the user an experience of responsive, incremental query building while pushing all actual query processing to the database layer. We evaluate our system with formative and controlled user studies on 28 spreadsheet users; the controlled study shows our system significantly outperforming Microsoft Access on the System Usability Scale." 
}, {"id":"Spreadsheet Driven Web Applications", "label":"Spreadsheet Driven Web Applications", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:4c03bacff2fd81603a4aee7d5ac63b69", "modified":"no", "pages":"97--106", "date":"2014-10", "author":["Benson, Edward","Zhang, Amy X.","Karger, David R."], "url":"http://doi.acm.org/10.1145/2642918.2647387", "doi":"10.1145/2642918.2647387", "acmid":"2647387", "venue":"UIST", "publisher":"ACM", "month":"October", "cat":["CHI","Databases","Systems","Visualization"], "series":"UIST '14", "key":"Karger:SpreadsheetApps", "numpages":"10", "year":"2014", "isbn":"978-1-4503-3069-5", "pdf":"http://edwardbenson.com/papers/uist2014-spreadsheet-driven-web-apps.pdf", "pub-type":"inproceedings", "keywords":["end-user programming","information architecture","spreadsheets","web design"], "booktitle":"Proceedings of the 27th Annual ACM Symposium on User Interface Software and Technology", "location":"Honolulu, Hawaii, USA", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#Spreadsheet%20Driven%20Web%20Applications" }, {"id":"On the Expected VCG Overpayment in Large Networks", "label":"On the Expected VCG Overpayment in Large Networks", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:d4175a7302d56c0532d686b72aa6a169", "modified":"no", "pages":"2831--2836", "date":"2006-12", "author":["Nikolova, Evdokia","Karger, David R."], "doi":"http://dx.doi.org/10.1109/CDC.2006.377149", "venue":"CDC", "month":"December", "cat":"Theory", "key":"Karger:VCGOverpayment", "year":"2006", "pdf":"http://people.csail.mit.edu/enikolova/papers/cdc-6pages.pdf", "pub-type":"inproceedings", "booktitle":"45th IEEE Conference on Decision and Control", "origin":"http://service.simile-widgets.org/babel/preview#On%20the%20Expected%20VCG%20Overpayment%20in%20Large%20Networks" }, {"id":"Enabling Web Browsers to Augment Web Sites' Filtering and Sorting Functionalities", "label":"Enabling Web 
Browsers to Augment Web Sites' Filtering and Sorting Functionalities", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:15430ad36fa4554096c22a8448eb1ca2", "modified":"no", "abstract":"Existing augmentations of web pages are mostly small cosmetic changes (e.g., removing ads) and minor addition of third-party content (e.g., product prices from competing sites). None leverages the structured data presented in web pages. This paper describes Sifter, a web browser extension that can augment a web site with advanced filtering and sorting functionality. These added features work inside the site's own pages, preserving the site's presentational style, as if the site itself has implemented the features. Sifter contains an algorithm that scrapes structured data out of web pages while usually requiring no user intervention. We tested Sifter on real web sites and real users and found that people could use Sifter to perform sophisticated queries and high-level analyses on sizable data collections on the Web. 
We propose that web sites can be similarly augmented with other sophisticated data-centric functionality, giving users new benefits over the existing Web.", "pages":"125--134", "date":"2006-05", "author":["Huynh, David F.","Miller, Robert C.","Karger, David R."], "pdfkb":"1400", "venue":"UIST", "month":"May", "cat":["Information Retrieval","Semantic Web","CHI","Haystack"], "key":"Karger:Sifter", "year":"2006", "pdf":"http://people.csail.mit.edu/dfhuynh/research/papers/uist2006-augmenting-web-sites.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the ACM Symposium on User Interface Software and Technology (UIST)", "origin":"http://service.simile-widgets.org/babel/preview#Enabling%20Web%20Browsers%20to%20Augment%20Web%20Sites'%20Filtering%20and%20Sorting%20Functionalities" }, {"id":"Cascading tree sheets and recombinant HTML: better encapsulation and retargeting of web content", "label":"Cascading tree sheets and recombinant HTML: better encapsulation and retargeting of web content", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:87ff9ae3597598780e990855c991267b", "modified":"no", "pages":"107--118", "date":"2013-05", "author":["Benson, Edward O.","Karger, David R."], "data":"2013-05", "url":"http://dl.acm.org/citation.cfm?id=2488388.2488399", "acmid":"2488399", "publisher":"International World Wide Web Conferences Steering Committee", "month":"May", "cat":["CHI","Visualization"], "series":"WWW '13", "key":"Karger:TreeSheets", "numpages":"12", "year":"2013", "isbn":"978-1-4503-2035-1", "pdf":"http://edwardbenson.com/papers/www2013-cascading-tree-sheets.pdf", "pub-type":"inproceedings", "keywords":["css","cts","html","tree sheets","web authoring","xslt"], "booktitle":"Proceedings of the 22nd international conference on World Wide Web", "location":"Rio de Janeiro, Brazil", "address":"Republic and Canton of Geneva, Switzerland", 
"origin":"http://service.simile-widgets.org/babel/preview#Cascading%20tree%20sheets%20and%20recombinant%20HTML%3A%20better%20encapsulation%20and%20retargeting%20of%20web%20content" }, {"id":"Adopting a Common Data Model for End User Web Programming Tools", "label":"Adopting a Common Data Model for End User Web Programming Tools", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:8afe9e9cf920a359c3d5272ca93a16b4", "modified":"no", "date":"2009-04", "author":["Karger, David R.","Huynh, David"], "month":"April", "cat":["CHI","Databases"], "key":"Karger:DataModelEUP", "year":"2009", "pub-type":"inproceedings", "booktitle":"CHI2009 Workshop on End User Programming for the Web", "origin":"http://service.simile-widgets.org/babel/preview#Adopting%20a%20Common%20Data%20Model%20for%20End%20User%20Web%20Programming%20Tools" }, {"id":"Route Planning under Uncertainty: The Canadian Traveller Problem", "label":"Route Planning under Uncertainty: The Canadian Traveller Problem", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:c4a0cf8299a1ad382be097885ae36d95", "modified":"no", "crossref":"Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, AAAI 2008, Chicago, Illinois, USA, July 13-17, 2008", "pages":"969-974", "date":"2008-07", "author":["Nikolova, Evdokia","Karger, David R."], "bibsource":"DBLP, http://dblp.uni-trier.de", "editor":"Dieter Fox and Carla P. 
Gomes", "venue":"AAAI", "publisher":"AAAI Press", "month":"July", "cat":"Theory", "key":"Karger:Canadian", "year":"2008", "isbn":"978-1-57735-368-3", "pub-type":"inproceedings", "booktitle":"Twenty-Third AAAI Conference on Artificial Intelligence", "origin":"http://service.simile-widgets.org/babel/preview#Route%20Planning%20under%20Uncertainty%3A%20The%20Canadian%20Traveller%20Problem" }, {"id":"The Complexity of Matrix Completion", "label":"The Complexity of Matrix Completion", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:0b71ec125caeb896d362b6eaf5a6ffd9", "modified":"no", "crossref":"Proceedings of the {$17^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "place":"Miami, FL", "pages":"1103--1111", "date":"2006-01", "author":["Harvey, Nicholas J. A.","Karger, David R.","Yekhanin, Sergey"], "doi":"http://doi.acm.org/10.1145/1109557.1109679", "abstract":"Given a matrix whose entries are a mixture of numeric values and symbolic variables, the matrix completion problem is to assign values to the variables so as to maximize the resulting matrix rank. This problem has deep connections to computational complexity and numerous important algorithmic applications. Determining the complexity of this problem is a fundamental open question in computational complexity. Under different settings of parameters, the problem is variously in P, in RP, or NP-hard. We shed new light on this landscape by demonstrating a new region of NP-hard scenarios. As a special case, we obtain the first known hardness result for matrices in which each variable appears only twice. Another particular scenario that we consider is the simultaneous matrix completion problem, where one must simultaneously maximize the rank for several matrices that share variables. This problem has important applications in the field of network coding. 
Recent work has given a simple, greedy, deterministic algorithm for this problem, assuming that the algorithm works over a sufficiently large field. We show an exact threshold for the field size required to find a simultaneous completion efficiently. This result implies that, surprisingly, the simple greedy algorithm is optimal: finding a simultaneous completion over any smaller field is NP-hard.", "venue":"SODA", "publisher":"ACM Press", "month":"January", "cat":["Theory","Coding"], "key":"Karger:MatrixCompletion", "year":"2006", "isbn":"0-89871-605-5", "pdf":"http://math.ias.edu/~yekhanin/Papers/Soda06.pdf", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$17^{th}$} Annual {ACM}-{SIAM} Symposium on Discrete Algorithms", "location":"Miami, Florida", "organization":"ACM-SIAM", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#The%20Complexity%20of%20Matrix%20Completion" }, {"id":"Approximating, Verifying, and Constructing Minimum Spanning Forests", "label":"Approximating, Verifying, and Constructing Minimum Spanning Forests", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:5962d2e12e692905cce3b20742ccb956", "modified":"no", "note":"Manuscript.", "date":"1992", "author":"Karger, David R.", "cat":"Theory", "key":"Karger:MST-manuscript", "year":"1992", "pub-type":"unpublished", "origin":"http://service.simile-widgets.org/babel/preview#Approximating%2C%20Verifying%2C%20and%20Constructing%20Minimum%20Spanning%20Forests" }, {"id":"{GUI} --- Phooey!: The Case for Text Input", "label":"{GUI} --- Phooey!: The Case for Text Input", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:6226dec68bb6920a076b3085a445f465", "modified":"no", "pages":"193--202", "date":"2007-10", "author":["Van Kleek, Max","Bernstein, Michael","Karger, David R.","schraefel, mc"], "doi":"http://doi.acm.org/10.1145/1294211.1294247", "video":"http://people.csail.mit.edu/msbernst/videos/uist07-jourknow.mov", 
"venue":"UIST", "publisher":"ACM Press", "month":"October", "cat":["Information Retrieval","Semantic Web","CHI","Haystack"], "key":"Karger:Phooey", "year":"2007", "isbn":"978-1-59593-679-2", "pdf":"http://people.csail.mit.edu/msbernst/papers/p337-vankleek.pdf", "pub-type":"inproceedings", "booktitle":"UIST '07: Proceedings of the 20th annual ACM symposium on User interface software and technology", "location":"Newport, Rhode Island, USA", "address":"New York, NY, USA", "origin":"http://service.simile-widgets.org/babel/preview#%7BGUI%7D%20---%20Phooey!%3A%20The%20Case%20for%20Text%20Input" }, {"id":"Talking about Data: Sharing Richly Structured Information through Blogs and Wikis", "label":"Talking about Data: Sharing Richly Structured Information through Blogs and Wikis", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:340437e28727a261036a36c7c289431c", "modified":"no", "pages":"48--63", "date":"2010-11", "author":["Benson, Edward","Marcus, Adam","Howahl, Fabian","Karger, David R."], "url":"http://projects.csail.mit.edu/datapress", "doi":"http://dx.doi.org/10.1007/978-3-642-17746-0_4", "bibsource":"DBLP, http://dblp.uni-trier.de", "venue":"ISWC", "month":"November", "cat":["CHI","Visualization","Databases"], "key":"Karger:Datapress", "year":"2010", "pdf":"Papers/iswc2010-datapress.pdf", "pub-type":"inproceedings", "booktitle":"International Semantic Web Conference", "location":"Shanghai, China", "origin":"http://service.simile-widgets.org/babel/preview#Talking%20about%20Data%3A%20Sharing%20Richly%20Structured%20Information%20through%20Blogs%20and%20Wikis" }, {"id":"83624c7ccfd27668f44b73b2fa8c424c", "label":"A Randomized Fully Polynomial Approximation Scheme for the All Terminal Network Reliability Problem", "type":"Publication", "uri":"http://service.simile-widgets.org/babel/urn:83624c7ccfd27668f44b73b2fa8c424c", "modified":"no", "note":"Journal version appears in SIAM Journal on Computing 29(2)", "crossref":"Proceedings of the {$27^{th}$} 
{ACM} Symposium on Theory of Computing", "place":"Las Vegas, NV", "abstract":"The classic all-terminal network reliability problem posits a graph, each of whose edges fails independently with some given probability. The goal is to determine the probability that the network becomes disconnected due to edge failures. This problem has obvious applications in the design of communication networks. Since the problem is $\\SP$-complete and thus believed hard to solve exactly, a great deal of research has been devoted to estimating the failure probability. In this paper, we give a fully polynomial randomized approximation scheme that, given any n-vertex graph with specified failure probabilities, computes in time polynomial in n and $1/\\epsilon$ an estimate for the failure probability that is accurate to within a relative error of $1\\pm\\epsilon$ with high probability. We also give a deterministic polynomial approximation scheme for the case of small failure probabilities. Some extensions to evaluating probabilities of k-connectivity, strong connectivity in directed Eulerian graphs and r-way disconnection, and to evaluating the Tutte polynomial are also described.", "pages":"11--17", "date":"1995-05", "ps":"http://people.csail.mit.edu/karger/Papers/reliability.ps", "author":"Karger, David R.", "venue":"STOC", "publisher":"ACM Press", "month":"May", "cat":["Theory","Cuts and Flows"], "key":"Karger:Reliability-Conf", "year":"1995", "pub-type":"inproceedings", "booktitle":"Proceedings of the {$27^{th}$} {ACM} Symposium on Theory of Computing", "organization":"ACM", "origin":"http://service.simile-widgets.org/babel/preview#83624c7ccfd27668f44b73b2fa8c424c" } ] }