7th International Workshop on Historical Document Imaging and Processing (HIP’23)
August 25-26, 2023, San José, California, USA
There have been increased efforts worldwide to digitize our cultural heritage conveyed in historical documents. In this workshop, we bring together researchers from various fields working on document image acquisition, restoration, analysis, indexing, and retrieval to make these documents accessible in digital libraries.
It is the seventh satellite workshop of ICDAR dedicated to this topic, following HIP’11 in Beijing, HIP’13 in Washington, HIP’15 in Nancy, HIP’17 in Kyoto, HIP’19 in Sydney, and HIP’21 in Lausanne (hybrid), all of which were a significant success with strong participation. The workshop is planned for 1½ days, with oral presentations on August 25th and a guided tour of the Computer History Museum on August 26th. Each submission will undergo peer review, and distinguished submissions will be presented orally.
HIP aims to provide researchers with a forum that is complementary and synergetic to the main sessions at ICDAR on document analysis and recognition. The topics addressed in this workshop span the entire processing chain, from image acquisition to information extraction. This includes the growing importance of machine learning in the processing chain, and we encourage the presentation of entire projects in the context of historical documents.
UPDATE:
The submission deadline has been extended by two weeks to May 12, 2023, AoE (strict).
There will be no further extensions granted. The deadline for Camera Ready remains the same.
UPDATE-2:
The registration information for ICDAR2023 has been published: https://icdar2023.org/registration/
Please note that for every accepted paper at HIP’23, at least one author needs to register either for the “Full Conference Pass” or “Post Conference Pass” for participation in the workshop.
Furthermore, due to room availability, the workshop date has been moved from August 24th-25th to August 25th-26th, with the main workshop on August 25th and the 1/2-day excursion (to be confirmed) on August 26th.
Submission Deadline: May 12, 2023 (extended from April 28, 2023; time zone: Anywhere on Earth)
Acceptance Notification: June 16, 2023 (previously June 1, 2023)
Camera Ready: July 9, 2023
Workshop: August 25, 2023 (moved from August 24, 2023)
Excursion: August 26, 2023 (moved from August 25, 2023)
Venue: Adobe World Headquarters, 345 Park Avenue, San Jose, California, USA
It is our pleasure to announce that the 7th International Workshop on Historical Document Imaging and Processing (HIP’23) will be held in conjunction with ICDAR2023, on August 25th, 2023 in San José, USA.
The workshop brings together researchers working with historical documents and intends to be complementary and synergistic to the work in analysis and recognition featured in the main sessions of ICDAR, the premier international forum for researchers and practitioners in the document analysis community.
Submissions are accepted until May 12, 2023 (Time zone: Anywhere on Earth) via EasyChair and will be reviewed by members of the Program Committee.
Papers must not exceed 6 pages in length (including references). Submissions do not need to be anonymized, but authors are welcome to anonymize them if they prefer. Acceptance notifications will be sent out on June 16, 2023.
To prepare your camera-ready submission, please follow the instructions sent by ACM via email and upload your submission files to the TAPS system by July 9, 2023 for publication in the ACM Digital Library (see also the Proceedings of previous editions).
Authors should use the current ACM SIG Conference Proceedings Template (specifically the ‘sample-sigconf’ template) to prepare their papers. Note that ACM has created a new LaTeX template and updated the existing Word templates. A corresponding template is also available for Overleaf. Authors also need to classify their paper using the ACM Computing Classification System (CCS), as described in the ACM SIG Conference Proceedings Template.
Workshop topics include (but are not limited to):
Imaging and Image Acquisition
Document Restoration/Improving readability
Document Content Acquisition
Family History Documents and Genealogies
Automated Classification, Grouping and Hyperlinking of Historical Documents
Digital Humanities applications of document analysis and recognition
Natural Language Processing for Historical Documents
For work focusing on handwriting/paleography, we recommend you have a look at IWCP2023.
Clemens Neudecker
Berlin State Library
Directorate General
Potsdamer Strasse 33
10785 Berlin
Germany
clemens.neudecker@sbb.spk-berlin.de
Apostolos Antonacopoulos
PRImA Research Lab
School of Science, Engineering & Environment
University of Salford
Greater Manchester M5 4WT
United Kingdom
a.antonacopoulos@primaresearch.org
Maud Ehrmann
EPFL CDH DHI DHLAB
INN 116 (Bâtiment INN)
Station 14
CH-1015 Lausanne
Switzerland
maud.ehrmann@epfl.ch
Christian Clausner
PRImA Research Lab
School of Science, Engineering & Environment
Newton Building
University of Salford
Greater Manchester M5 4WT
United Kingdom
c.clausner@primaresearch.org
Kai Labusch
Berlin State Library
Information and Data Management
Potsdamer Strasse 33
10785 Berlin
Germany
kai.labusch@sbb.spk-berlin.de
Randy Wilson
FamilySearch
3201 Garden Dr.
Lehi, UT 84043
USA
WilsonR@familysearch.org
William Barrett
Department of Computer Science
Brigham Young University
Provo, Utah 84604
USA
barrett@cs.byu.edu
09h00-09h10 | Welcome message | |
SESSION 1: HTR and Multi-Modal Methods (Chair: Clemens Neudecker) | ||
09h10-09h25 | Handwritten Text Recognition from Crowdsourced Annotations | Solène Tarride, Tristan Faine, Mélodie Boillet, Harold Mouchère and Christopher Kermorvant |
09h25-09h40 | An Evaluation of Handwritten Text Recognition Methods for Historical Ciphered Manuscripts | Mohamed Ali Souibgui, Pau Torras, Jialuo Chen and Alicia Fornés |
09h40-09h55 | Drawing the Line: A Dual Evaluation Approach for Shaping Ground Truth in Image Retrieval Using Rich Visual Embeddings of Historical Images | David Tschirschwitz, Franziska Klemstein, Henning Schmidgen and Volker Rodehorst |
09h55-10h10 | Gauging the Limitations of Natural Language Supervised Text-Image Metrics Learning by Iconclass Visual Concepts | Kai Labusch and Clemens Neudecker |
10h10-10h25 | Two-step sequence transformer based method for Cham to Latin script transliteration | Tien Nam Nguyen, Jean-Christophe Burie, Thi-Lan Le and Anne-Valérie Schweyer |
10h25-10h45 | Break | |
SESSION 2: Classics (Chair: Irina Rabaev) | ||
10h45-11h00 | Feature Mixing for Writer Retrieval and Identification on Papyri Fragments | Marco Peer and Robert Sablatnig |
11h00-11h15 | Homer restored: Virtual reconstruction of Papyrus Bodmer 1 | Simon Perrin, Léopold Cudilla, Yejing Xie, Harold Mouchère and Isabelle Marthot-Santaniello |
11h15-11h30 | PapyTwin net: a Twin network for Greek letters detection on ancient Papyri | Manh Tu Vu and Marie Beurton-Aimar |
11h30-11h45 | Study of historical Byzantine seal images: the BHAI project for computer-based sigillography | Victoria Eyharabide, Laurence Likforman-Sulem, Lucia Maria Orlandi, Alexandre Binoux, Theophile Rageau, Qijia Huang, Attilio Fiandrotti, Beatrice Caseau and Isabelle Bloch |
11h45-12h00 | Classifying The Scripts of Aramaic Incantation Bowls | Said Naamneh, Nour Atamni, Boraq Madi, Shoshana Boardman, Daria Vasyutinsky Shapira, Irina Rabaev and Jihad El-Sana |
12h00-12h20 | Discussion | |
12h20-13h30 | Lunch | |
SESSION 3: Segmentation & Layout Analysis (Chair: Apostolos Antonacopoulos) | ||
13h30-13h45 | DIVA-DAF: A Deep Learning Framework for Historical Document Image Analysis | Lars Vögtlin, Anna Scius-Bertrand, Paul Maergner, Andreas Fischer and Rolf Ingold |
13h45-14h00 | Laypa: A Novel Framework for Applying Segmentation Networks to Historical Documents | Stefan Klut, Rutger van Koert and Ronald Sluijter |
14h00-14h15 | Document Layout Analysis with Deep Learning and Heuristics | Vahid Rezanezhad, Konstantin Baierer, Mike Gerber, Kai Labusch and Clemens Neudecker |
14h15-14h30 | A hybrid CNN-Transformer model for Historical Document Image Binarization | Vahid Rezanezhad, Konstantin Baierer and Clemens Neudecker |
14h30-15h00 | Break | |
SESSION 4: Language Technologies & Classification (Chair: Vincent Christlein) | ||
15h00-15h15 | Enhancing Named Entity Recognition for Holocaust Testimonies through Pseudo Labeling and Transformer-based Models | Isuri Anuradha, Le Ha, Ruslan Mitkov and Johannes-Dieter Steinert |
15h15-15h30 | NAME – A Rich XML Format for Named Entity and Relation Tagging | Christian Clausner, Stefan Pletschacher and Apostolos Antonacopoulos |
15h30-15h45 | Investigations on Self-supervised Learning for Script-, Font-type, and Location Classification on Historical Documents | Johan Zenk, Florian Kordon, Martin Mayr, Mathias Seuret and Vincent Christlein |
15h45-16h00 | DocLangID: Improving Few-Shot Training to Identify the Language of Historical Documents | Furkan Simsek, Brian Pfitzmann, Hendrik Raetz, Jona Otholt, Haojin Yang and Christoph Meinel |
16h00-16h30 | Discussion and wrap-up |
10am – 12pm: Workshop excursion to the Computer History Museum (participants pay for their own transportation to the museum).
For general enquiries about HIP’23 please contact the organizers.
Copyright © HIP’23 Organizing Committee
Web hosting provided by the Berlin State Library.