Page Not Found
Page not found. Your pixels are in another canvas. Read more
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
This is a page not in the main menu. Read more
Published:
This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false. Read more
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool. Read more
Published in CS4984: Special Topics, 2018
Team 16 in the fall 2018 course “CS 4984/5984 Big Data Text Summarization,” in partnership with the University Libraries and the Digital Library Research Laboratory, prepared a corpus of electronic theses and dissertations (ETDs) for … Read more
Recommended citation: Naman Ahuja, Ritesh Bansal, William A. Ingram, Palakh Jude, Sampanna Kahu, and Xinyue Wang. "Big Data Text Summarization: Using Deep Learning to Summarize Theses and Dissertations." http://hdl.handle.net/10919/86406
Published in CNI: Coalition for Networked Information Fall 2019 Membership Meeting, 2019
Virginia Polytechnic Institute and State University (Virginia Tech) Libraries, in collaboration with Virginia Tech Department of Computer Science and Old Dominion University Department of Computer Science, is the recipient of an IMLS National Leadership Grant for Libraries award… Read more
Recommended citation: William A. Ingram. Bringing Computational Access to Book-length Documents Via an ETD Pilot. CNI: Coalition for Networked Information Fall 2019 Membership Meeting. December 9-10, 2019. Washington, DC. https://www.cni.org/topics/electronic-theses-dissertations-etds/bringing-computational-access-to-book-length-documents-via-an-etd-pilot
Published in ETD Conference 2019, Porto, Portugal, 2020
Inspired by the millions of Electronic Theses and Dissertations (ETDs) openly available online, we describe a novel use of ETDs as data for text summarization. We use a large corpus of ETDs to evaluate techniques for … Read more
Recommended citation: William A. Ingram, Bipasha Banerjee, and Edward A. Fox. 2020. Summarizing ETDs with deep learning. Cadernos de Biblioteconomia, Arquivística e Documentação 1 (Mar. 2020), 46–52. https://doi.org/10.48798/cadernosbad.2014 https://bad.pt/publicacoes/index.php/cadernos/article/viewFile/2014/pdf
Published in Data and Information Management, 2020
Natural language processing (NLP) covers a large number of topics and tasks related to data and information management, leading to a complex and challenging teaching process. Meanwhile, problem-based learning is a teaching… Read more
Recommended citation: Liuqing Li, Jack H. Geissinger, William A. Ingram, and Edward A. Fox. 2020. Teaching Natural Language Processing through Big Data Text Summarization with Problem-Based Learning. Data and Information Management 4, 1 (2020), 18–43. https://doi.org/10.2478/dim-2020-0003
Published in CS6604: Digital Libraries, 2020
In recent years, advances in natural language processing, machine learning, and neural networks have led to powerful tools for digital libraries, allowing library collections to be discovered, used, and reused in exciting new ways. … Read more
Recommended citation: John Aromando, Bipasha Banerjee, William A. Ingram, Palakh Jude, and Sampanna Kahu. 2020. "Classification and extraction of information from ETD documents." http://hdl.handle.net/10919/96645
Published in VTechWorks: Virginia Tech ETD, 2020
Great progress has been made to leverage the improvements made in natural language processing and machine learning to better mine data from journals, … Read more
Recommended citation: Palakh Mignonne Jude. June, 2020. Increasing Accessibility of Electronic Theses and Dissertations (ETDs) Through Chapter-level Classification. MS thesis, Computer Science, Virginia Tech (June, 2020). http://hdl.handle.net/10919/99294
Published in ACM/IEEE Joint Conference on Digital Libraries in 2020, 2020
This paper accompanies a poster that was accepted to the ACM/IEEE Joint Conference on Digital Libraries 2020 and received a Best Poster Award Honorable Mention. Read more
Recommended citation: Muntabir Hasan Choudhury, Jian Wu, William A. Ingram, and Edward A. Fox. 2020. A Heuristic Baseline Method for Metadata Extraction from Scanned Electronic Theses and Dissertations. In JCDL ’20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, Virtual Event, China, August 1-5, 2020. ACM, 515–516. https://doi.org/10.1145/3383583.3398590 https://dl.acm.org/doi/10.1145/3383583.3398590
Published in VTechWorks: Virginia Tech ETD, 2020
The ability to extract figures and tables from scientific documents can solve key use-cases such as their semantic parsing, summarization, or indexing. … Read more
Recommended citation: Sampanna Yashwant Kahu. 2020. Figure Extraction from Scanned Electronic Theses and Dissertations. Thesis. Virginia Tech. https://vtechworks.lib.vt.edu/handle/10919/100113 http://hdl.handle.net/10919/100113
Published in CNI: Coalition for Networked Information Fall 2020 Membership Meeting, 2020
Our ongoing research project applies computational analysis and text mining techniques to a large corpus of electronic theses and dissertations (ETDs) in order to gain insight into the evolution of graduate research topics. … Read more
Recommended citation: William A. Ingram. Mining ETDs for Trends in Graduate Research. CNI: Coalition for Networked Information Fall 2020 Membership Meeting, November 12, 2020. Virtual. https://www.cni.org/topics/electronic-theses-dissertations-etds/mining-etds-for-trends-in-graduate-research
Published in ACM/IEEE Joint Conference on Digital Libraries in 2021, 2021
We focus on electronic theses and dissertations (ETDs), aiming to improve access and expand their utility, since more than 6 million are publicly available, and they constitute an important corpus to … Read more
Recommended citation: Sampanna Yashwant Kahu, William A. Ingram, Edward A. Fox, and Jian Wu. 2021. ScanBank: A Benchmark Dataset for Figure Extraction from Scanned Electronic Theses and Dissertations. In ACM/IEEE Joint Conference on Digital Libraries, JCDL 2021, Champaign, IL, USA, September 27-30, 2021. IEEE, 180–191. https://doi.org/10.1109/JCDL52503.2021.00030
Published in ACM/IEEE Joint Conference on Digital Libraries in 2021, 2021
Electronic Theses and Dissertations (ETDs) contain domain knowledge that can be used for many digital library tasks, such as analyzing citation networks and predicting research … Read more
Recommended citation: Muntabir Hasan Choudhury, Himarsha R. Jayanetti, Jian Wu, William A. Ingram, and Edward A. Fox. 2021. Automatic Metadata Extraction Incorporating Visual Features from Scanned Electronic Theses and Dissertations. In ACM/IEEE Joint Conference on Digital Libraries, JCDL 2021, Champaign, IL, USA, September 27-30, 2021. IEEE, 230–233. https://doi.org/10.1109/JCDL52503.2021.00066
Published in ETD conference, 2021
Theses and dissertations contain a wealth of knowledge reflecting graduate students' exploration in a scholarly domain. Although print submission was common practice early on … Read more
Recommended citation: Bipasha Banerjee, William A. Ingram, Jian Wu, and Edward A. Fox. 2021. Applications of Mining ETDs. In 24th International Symposium on Electronic Theses and Dissertations (ETD 2021), November 15-17, 2021, United Arab Emirates. https://doi.org/10.26226/morressier.614c9b8c87a68d83cb5d59b2
Published in IEEE International Conference on Big Data (Big Data), 2021
In this work, we report our progress on building a collection containing over 450k Electronic Theses and Dissertations (ETDs), including full-text and metadata … Read more
Recommended citation: Sami Uddin, Bipasha Banerjee, Jian Wu, William A. Ingram, and Edward A. Fox. 2021. Building A Large Collection of Multi-domain Electronic Theses and Dissertations. In 2021 IEEE International Conference on Big Data (Big Data), Orlando, FL, USA, December 15-18, 2021. IEEE, 6043–6045. https://doi.org/10.1109/BigData52589.2021.9672058
Published in Twenty-Fourth International Conference on Grey Literature, 2022
A conference talk on a corpus of scholarly big data. Read more
Recommended citation: William A. Ingram, Jian Wu, and Edward A. Fox. 2022. Electronic Theses and Dissertations: A Research Corpus of Scholarly Big Data. GreyNet International. https://doi.org/10.5446/59869
Published in Companion Proceedings of the ACM Web Conference 2022, 2022
Datasets and software packages are considered important resources that can be used for replicating computational experiments. … Read more
Recommended citation: Lamia Salsabil, Jian Wu, Muntabir Hasan Choudhury, William A. Ingram, Edward A. Fox, Sarah Michele Rajtmajer, and C. Lee Giles. 2022. A Study of Computational Reproducibility using URLs Linking to Open Access Datasets and Software. In Companion of The Web Conference 2022, Virtual Event / Lyon, France, April 25 - 29, 2022. ACM, 784–788. https://doi.org/10.1145/3487553.3524658
Published in 25th International Symposium on Electronic Theses and Dissertations - ETD 2022, Novi Sad, Serbia September 7 - 9, 2022, 2022
Electronic theses and dissertations (ETDs) contain valuable knowledge that can be useful in a wide range of research areas… Read more
Recommended citation: Aman Ahuja, William A. Ingram, Chenyu Mao, Chongyu He, Jianchi Wei, and Edward A. Fox. 2022. Analyzing and Navigating ETDs Using Topic Models. In 25th International Symposium on Electronic Theses and Dissertations (ETD 2022), September 7-9, 2022, Novi Sad, Serbia. https://hdl.handle.net/10919/109986
Published in Proceedings of the first Workshop on Information Extraction from Scientific Publications, 2022
Electronic theses and dissertations (ETDs) contain valuable knowledge that can be useful for a wide range of purposes. To effectively utilize the knowledge contained in ETDs … Read more
Recommended citation: Aman Ahuja, Alan Devera, and Edward Alan Fox. 2022. Parsing Electronic Theses and Dissertations Using Object Detection. In Proceedings of the first Workshop on Information Extraction from Scientific Publications. Association for Computational Linguistics, Online, 121–130. https://aclanthology.org/2022.wiesp-1.14
Published in 2022 IEEE International Conference on Big Data (Big Data), 2022
Theses and dissertations record the work of graduate students and are typically a requirement at the culmination of the graduate degree … Read more
Recommended citation: Bipasha Banerjee, William A. Ingram, Jian Wu, and Edward A. Fox. 2022. Applications of data analysis on scholarly long documents. In IEEE International Conference on Big Data, Big Data 2022, Osaka, Japan, December 17-20, 2022. IEEE, 2473–2481. https://doi.org/10.1109/BigData55660.2022.10020935
Published in VTechWorks: Virginia Tech ETD, 2023
Electronic theses and dissertations (ETDs) are structured documents in which chapters are major components.… Read more
Recommended citation: Javaid Akbar Manzoor. 2023. Segmenting Electronic Theses and Dissertations By Chapters. Thesis. Virginia Tech. https://vtechworks.lib.vt.edu/handle/10919/113246 http://hdl.handle.net/10919/113246
Published in Companion Proceedings of the ACM Web Conference 2023, 2023
Parsing long documents, such as books, theses, and dissertations, is an important component of information extraction from scholarly documents … Read more
Recommended citation: Aman Ahuja, Kevin Dinh, Brian Dinh, William A. Ingram, and Edward Fox. 2023. A New Annotation Method and Dataset for Layout Analysis of Long Documents. In Companion Proceedings of the ACM Web Conference 2023 (Austin, TX, USA) (WWW ’23 Companion). Association for Computing Machinery, New York, NY, USA, 834–842. https://doi.org/10.1145/3543873.3587609 https://dl.acm.org/doi/abs/10.1145/3543873.3587609