14. KDD 2008: Las Vegas, Nevada, USA
Ying Li, Bing Liu, Sunita Sarawagi (Eds.):
Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, Nevada, USA, August 24-27, 2008.
ACM 2008, ISBN 978-1-60558-193-4
Research papers
- Aris Anagnostopoulos, Ravi Kumar, Mohammad Mahdian:
Influence and correlation in social networks.
7-15
- Luca Becchetti, Paolo Boldi, Carlos Castillo, Aristides Gionis:
Efficient semi-streaming algorithms for local triangle counting in massive graphs.
16-24
- Indrajit Bhattacharya, Shantanu Godbole, Sachindra Joshi:
Structured entity identification and document categorization: two tasks with one joint model.
25-33
- Albert Bifet, Ricard Gavaldà:
Mining adaptively frequent closed unlabeled rooted trees in data streams.
34-42
- Mustafa Bilgic, Lise Getoor:
Effective label acquisition for collective classification.
43-51
- Francesco Bonchi, Carlos Castillo, Debora Donato, Aristides Gionis:
Topical query decomposition.
52-60
- Christos Boutsidis, Michael W. Mahoney, Petros Drineas:
Unsupervised feature selection for principal components analysis.
61-69
- Justin Brickell, Vitaly Shmatikov:
The cost of privacy: destruction of data-mining utility in anonymized data publishing.
70-78
- Deepayan Chakrabarti, Ravi Kumar, Kunal Punera:
Generating succinct titles for web URLs.
79-87
- Soumen Chakrabarti, Rajiv Khanna, Uma Sawant, Chiru Bhattacharyya:
Structured learning for non-smooth ranking losses.
88-96
- Ming-wei Chang, Wen-tau Yih, Christopher Meek:
Partitioned logistic regression for spam filtering.
97-105
- Jianhui Chen, Shuiwang Ji, Betul Ceran, Qi Li, Mingrui Wu, Jieping Ye:
Learning subspace kernels for classification.
106-114
- Wen-Yen Chen, Dong Zhang, Edward Y. Chang:
Combinational collaborative filtering for personalized community recommendation.
115-123
- Xue-wen Chen, Michael Wasikowski:
FAST: a ROC-based feature selection metric for small samples and imbalanced data classification problems.
124-132
- Haibin Cheng, Pang-Ning Tan:
Semi-supervised learning with data calibration for long-term time series forecasting.
133-141
- Yong Ju Cho, Naren Ramakrishnan, Yang Cao:
Reconstructing chemical reaction networks: data mining meets system identification.
142-150
- Peter Christen:
Automatic record linkage using seeded nearest neighbour and support vector machine classification.
151-159
- David J. Crandall, Dan Cosley, Daniel P. Huttenlocher, Jon M. Kleinberg, Siddharth Suri:
Feedback effects between similarity and social influence in online communities.
160-168
- Kaustav Das, Jeff G. Schneider, Daniel B. Neill:
Anomaly pattern detection in categorical datasets.
169-176
- Atish Das Sarma, Sreenivas Gollapudi, Samuel Ieong:
Bypass rates: reducing query abandonment using negative inferences.
177-185
- Anirban Dasgupta, Ravi Kumar, Amit Sasturkar:
De-duping URLs via rewrite rules.
186-194
- Jason V. Davis, Inderjit S. Dhillon:
Structured metric learning for high dimensional problems.
195-203
- Luc De Raedt, Tias Guns, Siegfried Nijssen:
Constraint programming for itemset mining.
204-212
- Charles Elkan, Keith Noto:
Learning classifiers from only positive and unlabeled data.
213-220
- Kave Eshghi, Shyamsundar Rajaram:
Locality sensitive hash functions based on concomitant rank order statistics.
221-229
- Wei Fan, Kun Zhang, Hong Cheng, Jing Gao, Xifeng Yan, Jiawei Han, Philip S. Yu, Olivier Verscheure:
Direct mining of discriminative and essential frequent patterns via model-based search tree.
230-238
- George Forman, Shyamsundar Rajaram:
Scaling up text classification for large file systems.
239-246
- Yasuhiro Fujiwara, Yasushi Sakurai, Masashi Yamamuro:
SPIRAL: efficient and exact model identification for hidden Markov models.
247-255
- Brian Gallagher, Hanghang Tong, Tina Eliassi-Rad, Christos Faloutsos:
Using ghost edges for classification in sparsely labeled networks.
256-264
- Srivatsava Ranjit Ganta, Shiva Prasad Kasiviswanathan, Adam Smith:
Composition attacks and auxiliary information in data privacy.
265-273
- Venkatesh Ganti, Arnd Christian König, Rares Vernica:
Entity categorization over large document collections.
274-282
- Jing Gao, Wei Fan, Jing Jiang, Jiawei Han:
Knowledge transfer via multiple model local structure mapping.
283-291
- Gemma C. Garriga, Esa Junttila, Heikki Mannila:
Banded structure in binary matrices.
292-300
- Rohit Gupta, Gang Fang, Blayne Field, Michael Steinbach, Vipin Kumar:
Quantitative evaluation of approximate frequent pattern mining algorithms.
301-309
- Robert Hall, Charles A. Sutton, Andrew McCallum:
Unsupervised deduplication using cross-field dependencies.
310-317
- Meng Hu, Jiong Yang, Wei Su:
Permu-pattern: discovery of mutable permutation patterns with proximity constraint.
318-326
- Heng Huang, Chris H. Q. Ding, Dijun Luo, Tao Li:
Simultaneous tensor subspace selection and clustering: the equivalence of high order SVD and k-means clustering.
327-335
- Woochang Hwang, Taehyong Kim, Murali Ramanathan, Aidong Zhang:
Bridging centrality: graph mining from element level to group level.
336-344
- Saara Hyvönen, Pauli Miettinen, Evimaria Terzi:
Interpretable nonnegative matrix decompositions.
345-353
- Georgiana Ifrim, Gökhan H. Bakir, Gerhard Weikum:
Fast logistic regression for text categorization with variable-length n-grams.
354-362
- Tomoharu Iwata, Takeshi Yamada, Naonori Ueda:
Probabilistic latent semantic visualization: topic model for visualizing documents.
363-371
- David D. Jensen, Andrew S. Fast, Brian J. Taylor, Marc E. Maier:
Automatic identification of quasi-experimental designs for discovering causal knowledge.
372-380
- Shuiwang Ji, Lei Tang, Shipeng Yu, Jieping Ye:
Extracting shared subspace for multi-label classification.
381-389
- Bin Jiang, Jian Pei, Xuemin Lin, David W. Cheung, Jiawei Han:
Mining preferences from superior and inferior examples.
390-398
- Ruoming Jin, Muad Abu-Ata, Yang Xiang, Ning Ruan:
Effective and efficient itemset pattern summarization: regression-based approaches.
399-407
- S. Sathiya Keerthi, S. Sundararajan, Kai-Wei Chang, Cho-Jui Hsieh, Chih-Jen Lin:
A sequential dual method for large scale multi-class linear SVMs.
408-416
- Jerry Kiernan, Evimaria Terzi:
Constructing comprehensive summaries of large event sequences.
417-425
- Yehuda Koren:
Factorization meets the neighborhood: a multifaceted collaborative filtering model.
426-434
- Gueorgi Kossinets, Jon M. Kleinberg, Duncan J. Watts:
The structure of information pathways in a social communication network.
435-443
- Hans-Peter Kriegel, Matthias Schubert, Arthur Zimek:
Angle-based outlier detection in high-dimensional data.
444-452
- Srivatsan Laxman, Vikram Tankasali, Ryen W. White:
Stream prediction using a generative model based on frequent episodes in event sequences.
453-461
- Jure Leskovec, Lars Backstrom, Ravi Kumar, Andrew Tomkins:
Microscopic evolution of social networks.
462-470
- Lei Li, Wenjie Fu, Fan Guo, Todd C. Mowry, Christos Faloutsos:
Cut-and-stitch: efficient parallel learning of linear dynamical systems on SMPs.
471-479
- Charles X. Ling, Jun Du:
Active learning with direct query construction.
480-487
- Xiao Ling, Wenyuan Dai, Gui-Rong Xue, Qiang Yang, Yong Yu:
Spectral domain-transfer learning.
488-496
- Xu Ling, Qiaozhu Mei, ChengXiang Zhai, Bruce R. Schatz:
Mining multi-faceted overviews of arbitrary topics in a text collection.
497-505
- Aurelie C. Lozano, Naoki Abe:
Multi-class cost-sensitive boosting with p-norm loss functions.
506-514
- Omid Madani, Jian Huang:
On updates that constrain the features' connections during learning.
515-523
- Mary McGlohon, Leman Akoglu, Christos Faloutsos:
Weighted graphs and disconnected components: patterns and a generator.
524-532
- Gabriela Moise, Jörg Sander:
Finding non-redundant, statistically significant regions in high dimensional data: a novel approach to projected and subspace clustering.
533-541
- Ramesh Nallapati, Amr Ahmed, Eric P. Xing, William W. Cohen:
Joint latent topic models for text and citations.
542-550
- Nam Nguyen, Rich Caruana:
Classification with partial labels.
551-559
- Dino Pedreschi, Salvatore Ruggieri, Franco Turini:
Discrimination-aware data mining.
560-568
- Ian Porteous, David Newman, Alexander T. Ihler, Arthur Asuncion, Padhraic Smyth, Max Welling:
Fast collapsed Gibbs sampling for latent Dirichlet allocation.
569-577
- Hiroto Saigo, Nicole Krämer, Koji Tsuda:
Partial least squares regression for graph mining.
578-586
- Issei Sato, Minoru Yoshida, Hiroshi Nakagawa:
Knowledge discovery of semantic relationships between words using nonparametric Bayesian graph model.
587-595
- Mukund Seshadri, Sridhar Machiraju, Ashwin Sridharan, Jean Bolot, Christos Faloutsos, Jure Leskovec:
Mobile call graphs: beyond power-law and lognormal distributions.
596-604
- Qihong Shao, Yi Chen, Shu Tao, Xifeng Yan, Nikos Anerousis:
Efficient ticket routing by resolution sequence mining.
605-613
- Victor S. Sheng, Foster J. Provost, Panagiotis G. Ipeirotis:
Get another label? Improving data quality and data mining using multiple, noisy labelers.
614-622
- Jin Shieh, Eamonn J. Keogh:
iSAX: indexing and mining terabyte sized time series.
623-631
- Ka Cheung Sia, Junghoo Cho, Yun Chi, Belle L. Tseng:
Efficient computation of personal aggregate queries on blogs.
632-640
- György J. Simon, Vipin Kumar, Zhi-Li Zhang:
Semi-supervised approach to rapid and reliable labeling of large data sets.
641-649
- Ajit Paul Singh, Geoffrey J. Gordon:
Relational learning via collective matrix factorization.
650-658
- Xiuyao Song, Chris Jermaine, Sanjay Ranka, John Gums:
A Bayesian mixture model with linear regression mixing proportions.
659-667
- Liang Sun, Shuiwang Ji, Jieping Ye:
Hypergraph spectral learning for multi-label classification.
668-676
- Lei Tang, Huan Liu, Jianping Zhang, Zohreh Nazeri:
Community evolution in dynamic multi-mode networks.
677-685
- Hanghang Tong, Spiros Papadimitriou, Jimeng Sun, Philip S. Yu, Christos Faloutsos:
Colibri: fast mining of large static and dynamic graphs.
686-694
- Pedro O. S. Vaz de Melo, Virgílio A. F. Almeida, Antonio Alfredo Ferreira Loureiro:
Can complex network metrics predict the behavior of NBA teams?
695-703
- Daniel David Walker, Eric K. Ringger:
Model-based document clustering with a collapsed Gibbs sampler.
704-712
- Pu Wang, Carlotta Domeniconi:
Building semantic kernels for text classification using Wikipedia.
713-721
- Michael L. Wick, Khashayar Rohanimanesh, Karl Schultz, Andrew McCallum:
A unified approach for schema matching, coreference and canonicalization.
722-730
- Fei Wu, Raphael Hoffmann, Daniel S. Weld:
Information extraction from Wikipedia: moving down the long tail.
731-739
- Junjie Wu, Hui Xiong, Jian Chen:
SAIL: summation-based incremental learning for information-theoretic clustering.
740-748
- Shan-Hung Wu, Keng-Pei Lin, Chung-Min Chen, Ming-Syan Chen:
Asymmetric support vector machines: low false-positive learning under the user tolerance.
749-757
- Yang Xiang, Ruoming Jin, David Fuhry, Feodor F. Dragan:
Succinct summarization of transactional databases: an overlapped hyperrectangle scheme.
758-766
- Yabo Xu, Ke Wang, Ada Wai-Chee Fu, Philip S. Yu:
Anonymizing transaction databases for publication.
767-775
- Jian Yang, Ning Zhong, Yiyu Yao, Jue Wang:
Local peculiarity factor and its application in outlier detection.
776-784
- Luh Yen, Marco Saerens, Amin Mantrach, Masashi Shimbo:
A family of dissimilarity measures between nodes generalizing both the shortest-path and the commute-time distances.
785-793
- Chun-Nam John Yu, Thorsten Joachims:
Training structural SVMs with kernels using sampled cuts.
794-802
- Lei Yu, Chris H. Q. Ding, Steven Loscalzo:
Stable feature selection via dense feature groups.
803-811
- Peng Zhang, Xingquan Zhu, Yong Shi:
Categorizing and mining concept drifting data streams.
812-820
- Xiang Zhang, Fei Zou, Wei Wang:
FastANOVA: an efficient algorithm for genome-wide association study.
821-829
- Bin Zhao, Fei Wang, Changshui Zhang:
CutS3VM: a fast semi-supervised SVM algorithm.
830-838
- Zheng Zhao, Jiangxin Wang, Huan Liu, Jieping Ye, Yung Chang:
Identifying biologically relevant genes via multiple heterogeneous data sources.
839-847
- Wenjun Zhou, Hui Xiong:
Volatile correlation computation: a checkpoint view.
848-856
Industrial papers
- Shyam Boriah, Vipin Kumar, Michael Steinbach, Christopher Potter, Steven A. Klooster:
Land cover change detection: a case study.
857-865
- Mohamed Bouguessa, Benoît Dumoulin, Shengrui Wang:
Identifying authoritative actors in question-answering forums: the case of Yahoo! Answers.
866-874
- Huanhuan Cao, Daxin Jiang, Jian Pei, Qi He, Zhen Liao, Enhong Chen, Hang Li:
Context-aware query suggestion by mining click-through and session data.
875-883
- Christine H. Chih, Douglass S. Parker:
The persuasive phase of visualization.
884-892
- Richard Chow, Philippe Golle, Jessica Staddon:
Detecting privacy leaks using corpus-based association rules.
893-901
- Ying Cui, Jennifer G. Dy, Gregory C. Sharp, Brian M. Alexander, Steve B. Jiang:
Learning methods for lung tumor markerless gating in image-guided radiotherapy.
902-910
- Shantanu Godbole, Shourya Roy:
Text classification, business intelligence, and interactivity: automating C-Sat analysis for services industry.
911-919
- Robert L. Grossman, Yunhong Gu:
Data mining using high performance data clouds: experimental studies using Sector and Sphere.
920-927
- Shen-Shyang Ho, Ashit Talukder:
Automated cyclone discovery and tracking using knowledge sharing in multiple heterogeneous satellite data.
928-936
- Noam Koenigstein, Yuval Shavitt, Tomer Tankel:
Spotting out emerging artists using geo-aware analysis of P2P query strings.
937-945
- Prem Melville, Saharon Rosset, Richard D. Lawrence:
Customer targeting models using actively-selected web content.
946-953
- Fabian Mörchen, Mathäus Dejori, Dmitriy Fradkin, Julien Etienne, Bernd Wachmann, Markus Bundschus:
Anticipating annotations and emerging trends in biomedical literature.
954-962
- G. Niklas Norén, Andrew Bate, Johan Hopstadius, Kristina Star, I. Ralph Edwards:
Temporal pattern discovery for trends and transient effects: its application to patient records.
963-971
- Nish Parikh, Neel Sundaresan:
Scalable and near real-time burst detection from eCommerce queries.
972-980
- Renuka Sindhgatta:
Identifying domain expertise of developers from source code.
981-989
- Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zhang, Zhong Su:
ArnetMiner: extraction and mining of academic social networks.
990-998
- Leonardo Weiss Ferreira Chaves, Erik Buchmann, Klemens Böhm:
Tagmark: reliable estimations of RFID tags for business processes.
999-1007
- Gang Wu, Brendan Kitts:
Experimental comparison of scalable online ad serving.
1008-1015
- Xintian Yang, Sitaram Asur, Srinivasan Parthasarathy, Sameep Mehta:
A visual-analytic toolkit for dynamic interaction graphs.
1016-1024
- Jieping Ye, Kewei Chen, Teresa Wu, Jing Li, Zheng Zhao, Rinkal Patel, Min Bae, Ravi Janardan, Huan Liu, Gene Alexander, Eric Reiman:
Heterogeneous data fusion for Alzheimer's disease study.
1025-1033
- Shipeng Yu, Glenn Fung, Rómer Rosales, Sriram Krishnan, R. Bharat Rao, Cary Dehing-Oberije, Philippe Lambin:
Privacy-preserving Cox regression for survival analysis.
1034-1042
- Sai Zeng, Prem Melville, Christian A. Lang, Ioana M. Boier-Martin, Conrad Murphy:
Using predictive analysis to improve invoice-to-cash collection.
1043-1050
- Yi Zhang, Arun C. Surendran, John C. Platt, Mukund Narasimhan:
Learning from multi-topic web documents for contextual advertisement.
1051-1059
Panel
Demonstrations
- Hendrik Blockeel, Toon Calders, Élisa Fromont, Bart Goethals, Adriana Prado, Céline Robardet:
An inductive database prototype based on virtual mining views.
1061-1064
- Peter Christen:
Febrl: an open source data cleaning, deduplication and record linkage system with a graphical user interface.
1065-1068
- Luigi Di Caro, K. Selçuk Candan, Maria Luisa Sapino:
Using TagFlake for condensing navigable tag hierarchies from tag clouds.
1069-1072
- Shantanu Godbole, Shourya Roy:
An integrated system for automatic customer satisfaction analysis in the services industry.
1073-1076
- Ming Hua, Jian Pei:
DiMaC: a disguised missing data cleaning tool.
1077-1080
- Evangelos E. Kotsifakos, Irene Ntoutsi, Yannis Vrahoritis, Yannis Theodoridis:
Pattern-Miner: integrated management and mining over data mining models.
1081-1084
- Hongyan Liu, Hui Yang, Wenbo Li, Wei Wei, Jun He, Xiaoyong Du:
CRO: a system for online review structurization.
1085-1088
- Emmanuel Müller, Ira Assent, Ralph Krieger, Timm Jansen, Thomas Seidl:
Morpheus: interactive exploration of subspace clustering.
1089-1092
- Hill Nguyen, Nish Parikh, Neel Sundaresan:
A software system for buzz-based recommendations.
1093-1096
- Shuyi Zheng, Matthew R. Scott, Ruihua Song, Ji-Rong Wen:
Pictor: an interactive system for importing data from a website.
1097-1100
Copyright © Fri Mar 12 17:18:02 2010
by Michael Ley (ley@uni-trier.de)