cv
Basics
| Name | Abdelrahman Abdallah |
| Label | Ph.D. Candidate · Machine Learning & NLP |
| Email | abdelrahman.abdallah@uibk.ac.at |
| Phone | +20 11 1137 1734 |
| Url | https://abdoelsayed2016.github.io/ |
| Summary | Machine learning engineer and final-year Ph.D. candidate with experience in natural language processing, information retrieval, and computer vision. Passionate about building LLM-based systems, OCR and QA pipelines, and tools that support both research and real-world applications. |
Work
-
2022.10 - Present Research Assistant
Digital Science Center (DiSC), University of Innsbruck
Research assistant working on NLP and information retrieval within the Digital Science Center.
- Knowledge extraction and information retrieval from unstructured text documents.
- Methods for natural language processing and information retrieval.
- Application of text mining methods to the field of digital history.
-
2022.01 - 2022.10 Machine Learning Researcher
Università Ca' Foscari
Researcher working on ML models for climate change and risk assessment.
- Comparative analysis of graph neural networks and random forests for climate-related tasks.
- Contributed to a review paper on ML/AI models for risk assessments.
- Surveyed graph neural networks for spatio-temporal data.
-
2021.08 - 2022.06 Machine Learning Engineer
KMG Engineering
Machine learning engineer focusing on vision and NLP applications.
- Developed GAN-based models for image inpainting.
- Built English grammar correction models using deep learning.
- Worked on curve detection and tracking tasks.
-
2021.08 - 2025.05 Machine Learning Engineer (part-time)
DISCO App
Machine learning engineer working on OCR and NLP for digital receipts.
- Worked on receipt extraction, OCR systems, and NLP components.
- Improved OCR accuracy for receipts and for downstream information extraction and classification.
- Contributed to an application deployed on the Google Play Store.
-
2019.11 - 2021.06 Machine Learning Researcher
National Open Research Laboratory for Information and Space Technologies, Satbayev University
Researcher focusing on handwriting recognition, table detection, and document analysis.
- Built handwritten Kazakh and Russian databases for handwriting recognition research.
- Reviewed recent approaches to handwriting recognition for Cyrillic characters.
- Created a table detection dataset and developed models for table detection and classification.
-
2019.06 - 2019.11 Software Developer
CCC at Limkokwing University
Software developer working on web applications with database backends.
- Developed and implemented a scanning component using PHP and MySQL.
- Designed databases and table structures using n-tier architecture for web applications.
- Built web applications using the ExpressionEngine PHP framework.
-
2016.07 - 2019.06 Research and Teaching Assistant
Assiut University, Faculty of Computers and Information
Research and teaching assistant in Computer Science.
- Taught classes, supervised laboratory sessions, and graded assignments and projects.
- Represented teams in meetings with executives to discuss project goals and milestones.
- Kept up with emerging technologies and applied them to teaching and research projects.
-
2016.01 - 2017.08 Web Developer
FastKood Company
Web developer building data-driven websites and internal tools.
- Converted UI mockups into HTML, JavaScript, AJAX, and JSON.
- Worked with UNIX and Apache servers.
- Developed data architecture designs for targeted customer analysis.
- Created workflow charts and diagrams to support production teams and meet client deadlines.
-
2015.06 - 2016.01 Software Developer
Overcoffeesolutions
Software developer focusing on object-oriented applications.
- Developed object-oriented software and intuitive graphical user interfaces.
- Implemented scanning components using MySQL and solid database design.
- Built multiple web applications following n-tier architecture.
Volunteer
-
2023.01 - Present Remote
Reviewer
Conference and Journal Reviewer
Reviewer for major conferences and journals in NLP, IR, and computer vision.
- Conference reviewer: ACL, SIGIR, COLING, EMNLP, LREC-COLING, WACV.
- Journal reviewer: Pattern Recognition Letters, IET NBT, Heliyon, IET Signal Processing.
Education
-
2022.10 - Present Innsbruck, Austria
-
2019.09 - 2021.06 Almaty, Kazakhstan
MSc
Satbayev University, Faculty of Information and Telecommunication Technologies
Data Science and Machine Learning
-
2016.09 - 2017.06 Assiut, Egypt
-
2011.09 - 2015.06 Assiut, Egypt
Awards
- 2019.09.01
Scholarship for Master's Studies
Satbayev University
Scholarship to study for a Master's degree in Data Science and Machine Learning at Satbayev University.
Certificates
| NAACL 2025 Certificate | |
| NAACL | 2025-01-01 |
Publications
-
2026 [C22] MM-BRIGHT: A Multi-Task Multimodal Benchmark for Reasoning-Intensive Retrieval
KDD 2026
Abdelrahman Abdallah, Mohamed Darwish Mounis, Mahmoud Abdalla, Mahmoud SalahEldin Kasem, Mostafa Farouk Senussi, Mohamed Mahmoud, Mohammed Ali, Adam Jatowt, Hyun-Soo Kang.
-
2026 [C21] Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation
ACL Demo 2026
Abdelrahman Abdallah, Bhawna Piryani, Jamshid Mozafari, Andreas Herzinger, Jamie Holdcroft, Adam Jatowt.
-
2026 [C20] BracketRank: Large Language Model Document Ranking via Reasoning-based Competitive Elimination
ACL 2026
Abdelrahman Abdallah, Mohammed Ali, Bhawna Piryani, Adam Jatowt.
-
2026 [C19] Negative Sampling Techniques in Dense Retrieval: A Survey
Findings of EACL 2026
Laurin Wischounig*, Abdelrahman Abdallah*, Adam Jatowt. *Equal contribution.
-
2026 [C18] It's High Time: A Survey of Temporal Information Retrieval and Question Answering
ACL 2026
Bhawna Piryani, Abdelrahman Abdallah, Jamshid Mozafari, Avishek Anand, Adam Jatowt.
-
2026 [C17] RECOR: Reasoning-focused Multi-turn Conversational Retrieval Benchmark
Findings of ACL 2026
Mohammed Ali, Abdelrahman Abdallah, Amit Agarwal, Hitesh Laxmichand Patel, Adam Jatowt.
-
2026 [C16] Are LLM-Based Retrievers Worth Their Cost? An Empirical Study of Efficiency, Robustness, and Reasoning Overhead
SIGIR 2026
Abdelrahman Abdallah, Jamie Holdcroft, Mohammed Ali, Adam Jatowt.
-
2026 [C15] TempRetriever: Fusion-based Temporal Dense Passage Retrieval for Time-Sensitive Questions
WSDM 2026
Abdelrahman Abdallah, Bhawna Piryani, Jonas Wallat, Avishek Anand, Adam Jatowt.
-
2025 [C14] Evaluating Robustness of LLMs in Question Answering on Multilingual Noisy OCR Data
CIKM 2025
Bhawna Piryani, Jamshid Mozafari, Abdelrahman Abdallah, Antoine Doucet, Adam Jatowt.
-
2025 [C13] RankArena: A Unified Platform for Evaluating Retrieval, Reranking and RAG with Human and LLM Feedback
CIKM 2025
Abdelrahman Abdallah, Mahmoud Abdalla, Bhawna Piryani, Jamshid Mozafari, Mohammed Ali, Adam Jatowt.
-
2025 [C12] ComplexTempQA: A 100m Dataset for Complex Temporal Question Answering
EMNLP 2025
Raphael Gruber, Abdelrahman Abdallah, Michael Färber, Adam Jatowt.
-
2025 [C11] How Good are LLM-based Rerankers? An Empirical Analysis of State-of-the-Art Reranking Models
Findings of EMNLP 2025
Abdelrahman Abdallah, Bhawna Piryani, Jamshid Mozafari, Mohammed Ali, Adam Jatowt.
-
2025 [C10] DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation
Findings of EMNLP 2025
Abdelrahman Abdallah, Jamshid Mozafari, Bhawna Piryani, Adam Jatowt.
-
2025 [C9] Wrong Answers Can Also Be Useful: PlausibleQA - A Large-Scale QA Dataset with Answer Plausibility Scores
SIGIR 2025
Jamshid Mozafari, Abdelrahman Abdallah, Bhawna Piryani, Adam Jatowt.
-
2025 [C8] A Study into Investigating Temporal Robustness of LLMs
Findings of ACL 2025
Jonas Wallat, Abdelrahman Abdallah, Adam Jatowt, Avishek Anand.
-
2025 [C7] ASRank: Zero-Shot Re-Ranking with Answer Scent for Document Retrieval
Findings of NAACL 2025
Abdelrahman Abdallah, Jamshid Mozafari, Bhawna Piryani, Adam Jatowt.
-
2025 [C6] DynRank: Improve Passage Retrieval with Dynamic Zero-Shot Prompting Based on Question Classification
COLING 2025
Abdelrahman Abdallah, Jamshid Mozafari, Bhawna Piryani, Mohammed M. Abdelgwad, Adam Jatowt.
-
2025 [C5] CascadePLS-ViT: Cascade with Patch-Level Self-Supervised Vision Transformers for Breast Cancer Classification in Mammography
ISBI 2025
Abdelrahman Abdallah, Mahmoud SalahEldin Kasem, Ibrahim Abdelhalim, Norah Saleh Alghamdi, Sohail Contractor, Ayman El-Baz.
-
2024 [C4] ArabicaQA: A Comprehensive Dataset for Arabic Question Answering
SIGIR 2024
Abdelrahman Abdallah, Mahmoud Kasem, Mahmoud Abdalla, Mohamed Mahmoud, Mohamed Elkasaby, Yasser Elbendary, Adam Jatowt.
-
2024 [C3] IHRRB-DINO: Identifying High-Risk Regions of Breast Masses in Mammogram Images Using Data-Driven Instance Noise
MICCAI 2024
Mahmoud SalahEldin Kasem*, Abdelrahman Abdallah*, Ibrahim Abdelhalim*, Norah Saleh Alghamdi, Sohail Contractor, Ayman El-Baz. *Equal contribution.
-
2024 [C2] Detecting Temporal Ambiguity in Questions
Findings of EMNLP 2024
Bhawna Piryani, Abdelrahman Abdallah, Jamshid Mozafari, Adam Jatowt.
-
2024 [C1] Exploring Hint Generation Approaches for Open-Domain Question Answering
Findings of EMNLP 2024
Jamshid Mozafari, Abdelrahman Abdallah, Bhawna Piryani, Adam Jatowt.
Skills
| Machine Learning & Data Science | |
| Machine Learning | |
| Deep Learning | |
| Natural Language Processing | |
| Information Retrieval | |
| Open-Domain Question Answering | |
| Large Language Models |
| Computer Vision | |
| Handwritten Text Recognition | |
| OCR | |
| Object Detection | |
| Generative Adversarial Networks | |
| Image Retrieval | |
| Image Processing | |
| Image Segmentation |
| Programming Languages | |
| Python | |
| PHP | |
| Java |
| ML & DL Frameworks | |
| PyTorch | |
| TensorFlow | |
| Keras | |
| scikit-learn |
| Web Development | |
| Laravel | |
| HTML | |
| CSS | |
| JavaScript | |
| jQuery |
| Tools | |
| PyCharm | |
| Anaconda | |
| Jupyter Notebook | |
| Git | |
| Linux |
Languages
| Arabic | Native |
| English | Fluent (Duolingo: 140) |
Interests
| Natural Language Processing | |
| Large Language Models | |
| Information Retrieval | |
| Open-Domain Question Answering | |
| Keyword Information Extraction | |
| Text Generation | |
| Computer Vision | |
| Handwriting Recognition | |
| OCR | |
| Medical Imaging | |
| Image Retrieval | |
| Segmentation | |
References
| References available upon request | |
| Academic and professional references can be provided upon request. |
Projects
- 2024.01 - Present
Rankify
Creator and maintainer of Rankify, a Python toolkit for retrieval, reranking, and RAG evaluation.
- Comprehensive Python package for information retrieval and reranking evaluation.
- Supports evaluation of retrieval, reranking, and RAG systems with automated metrics and human feedback.
- Integrated with the RankArena platform and has 500+ GitHub stars.
- 2024.01 - Present
RankArena
Lead developer of RankArena, a unified web platform for evaluating retrieval, reranking, and RAG systems.
- Provides standardized protocols for benchmarking IR models against strong baselines.
- Supports both human and LLM-based feedback for evaluation.
- Accepted at CIKM 2025.