Oxford University Press (OUP) is a department of the University of Oxford which furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide.
The Dictionaries Division publishes the flagship online products Oxford English Dictionary (OED) and Oxforddictionaries.com; leads innovation in digital lexical publishing and licensing, working with the world's largest technology and information providers; and launches new initiatives, including the Oxford Global Languages (OGL) Programme, which develops digital lexical resources for under-resourced languages worldwide.
The Dictionaries Technology Group within the Dictionaries Division is responsible for designing and deploying technical solutions in order to develop language data and new business capabilities around lexical content and linguistic data more broadly.
ABOUT THE ROLE
We have an exciting opportunity for a developer in natural language processing, computational linguistics, or language technology to work in a programme dedicated to developing linguistic corpora and related resources for a wide range of languages.
Proven expertise in software development for the creation of natural language processing (NLP) tools or language resources (e.g., corpora, lexicons) is essential. We expect you to enjoy problem-solving and sharing what you have learnt, as well as learning from your peers. You should appreciate working with a variety of technological resources and methodologies, and have enough experience to pick the appropriate tool for the issue at hand.
You will work in a cross-functional Agile team and be part of an expanding digital business. You will have the opportunity to build professional experience in a technology team that enjoys the advantages of a start-up culture within a solid and stable company. Moreover, you'll be able to see the real-time impact of your successful delivery.
Your responsibilities will include:
• Developing, testing, and documenting new or customized NLP software components, from supplied specifications and in accordance with agreed standards
• Deploying NLP pipelines for languages of very different types
• Deploying or customizing tools for manual annotation tasks
• Contributing solutions to each task by assessing and selecting best-fit approaches, libraries and services
• Contributing to team-based standards for programming tools and techniques
• Monitoring the field to gain knowledge and understanding of emerging data technologies and techniques
• Working off-site is negotiable under certain conditions, although our preferred option is on-site.
To be successful in this role you must have:
• A degree in Computer Science, Artificial Intelligence, or similar; or several years of experience in the field
• Expertise in one or more areas within natural language processing (NLP), such as POS tagging, lemmatization, parsing, or semantic analysis
• Familiarity with corpus building projects
• Experience with quantitative and machine learning-based approaches to linguistic data
• Good programming skills, with working proficiency in Python and Java
• Experience with the Apache UIMA architecture will be highly valued; familiarity with other NLP frameworks (e.g., GATE, NLTK) will also be accepted
• Strong written and verbal communication
• Effectiveness in the transfer of technical expertise to colleagues
• Interest in expanding knowledge and learning new skills
• Some familiarity with Machine Translation
• Some knowledge of APIs and of relational and NoSQL databases
• Some familiarity with frameworks and approaches for working with big data (e.g., Hadoop, MapReduce)
• Competency in Git, Subversion, or similar source control tools, and in dependency management
• Previous work with Agile methodologies
If you have the talent and desire to work in a team delivering high-quality, innovative solutions, and to add some industry experience to your CV, please apply to join us.
Closing date: 7 August 2018