CDI 2019 Performance and modelling internship - 5-6 months, Interns/Students, Hardware Engineering

at Arm in Sophia-Antipolis, in the electronics/hardware category

Are you creative, innovative, and enthusiastic about new technologies? Do you enjoy programming and have a knack for problem solving?

At Arm you will shape the future of technology and collaborate in the development of next-generation CPUs to power billions of devices worldwide.

Internship offers at Arm Sophia Antipolis

We have exciting opportunities in the CPU hardware groups. We develop mainstream processors ranging from high-performance cores to low-power secure micro-controllers.

At Arm we empower our talented individuals to work together as a team to push the boundaries of what is possible!

As an intern you will directly influence the development of hardware IP that is used extensively in a wide variety of devices, from mobile phones and tablets to sensors and servers.

You will work closely with our talented and expert engineers to help design groundbreaking technology. You will gain knowledge and tackle challenges, with opportunities to work on all aspects of product development.

You will work on real projects from day one, with help and guidance from expert engineers. Through teamwork and dedication to personal development, we ensure that every intern learns about different aspects of our work and becomes knowledgeable in the field.

What will your role be?

Depending on the subject:

You will help analyse trade-offs between different options via modelling in C, C++ or a hardware description language (HDL).

You will develop IP in HDL, working with the rest of the design team to deliver a product with leading power efficiency and performance.

You will verify IP to the highest quality standards using a wide range of methodologies, such as constrained random simulation with test benches written in SystemVerilog, running real applications on emulation or FPGA platforms, and formal methods.

Several subjects are available and described below. Choose the one(s) you prefer and indicate them in your application.
Only one student will work on a given subject, supervised by a dedicated experienced engineer.

Subject details

[2019-P-1] Detect and optimize critical loads in an Out of Order processor
The memory system is known to be one of the most critical performance bottlenecks in the latest generations of processors. As a complement to existing techniques aiming at mitigating this issue, such as cache memories and prefetchers, this internship focuses on identifying and prioritizing the most critical loads in a processor out-of-order pipeline. You will start by ramping up on the CPU micro-architecture, with a particular focus on its memory subsystem, and by performing a short bibliography study on critical load optimizations. You will then select the most promising solution(s) and implement them in an existing in-house CPU model, initially focusing on the identification of such loads, then on the possible modifications to optimize them. In a final stage, a benchmarking campaign will be launched on the modified CPU model to showcase the performance improvements on a wide range of workloads.

[2019-P-2] Enable system level virtualization studies
Many operating systems and applications run in virtualized environments, and many modern CPUs implement dedicated hardware to improve the overall performance and efficiency of virtualization. The goal of this internship is to enable the performance analysis of such virtualized workloads from a CPU point of view. You will first study and upgrade an existing in-house performance model, comprising a modern out-of-order CPU along with key system components, to benefit from the latest improvements of the Arm architecture. In a second stage, you will conduct performance analysis campaigns to evaluate and quantify those benefits on a wide range of virtualized workloads.

[2019-P-3] Develop timing back-annotation infrastructure in QEMU
QEMU is an open source technology that (amongst many other capabilities) emulates several instruction sets from various architectures (ARM, x86, …). The goal of the internship is to study and implement a solution that attaches timing back-annotation information to the execution of instructions (or groups of instructions) so as to provide a high-level estimation of the performance of an emulated program. After developing and setting up the infrastructure in QEMU, you will validate your work on a real-case study, correlating your results on the latest Cortex-A processor under development in the centre against other in-house performance models.

[2019-P-4] Develop GPU black-box to improve system performance analysis accuracy
Taking into account the impact of all components in the system is usually key when studying the performance of a processor. In system modelling, it is however very common to implement them as simple black-boxes, limiting their functionality to the bare minimum. Our processor performance analysis team uses several such modelling and simulation environments, keeping them up to date with the latest generation of system IPs. The goal of this internship is to improve some of those existing system models by developing and adding a GPU black-box model. You will start by defining and developing the GPU black-box model, before implementing it in the various system environments, from software models to hardware emulation. Once it is available, you will run workloads in those updated environments to validate the integration and analyse the impact of the GPU presence in the system on the CPU performance.

[2019-P-5] Investigate processor replaying architectures and selective flushing
The vast majority of high-end application processors now execute instructions speculatively to reach their high level of performance.
One of the underlying issues is that mis-speculation cases are very common, and the cost of recovery can badly impact the benefits of good speculation. The goal of this internship is to study, implement and evaluate state-of-the-art replaying architecture / selective pipeline flush techniques. After a short bibliography study, you will select the most promising solution(s) based on their cost/benefit ratio and implement them in our in-house CPU performance model. You will then validate your implementation with a real-case study, assessing the performance improvements of the chosen solution(s) on the latest application processor currently under development in the centre.

[2019-P-6] Enable CPU performance investigations at the system level
The gem5 simulator is an open source modular platform used for computer-system architecture research and performance analysis. The goal of this internship is to integrate our existing in-house CPU performance model in the gem5 environment and evaluate the impact on the overall system performance. After an initial ramp-up phase discovering the in-house CPU model and gem5 environments, you will complete the CPU integration at different levels in the gem5 system. You will have the opportunity to learn about advanced memory systems and their associated performance optimizations (caches, prefetchers, …), and face the challenges of supporting efficient multiprocessor environments. Once the integration is available, you will run various experiments and simulations, and assess the impact of the CPU integration in those different scenarios.

[2019-P-7] Improve the efficiency of CPU performance analysis with data science
Benchmarking and performance analysis are among the most significant activities that drive the development of high-end processors. The objective of this internship is to develop a new methodology that could point design engineers more quickly to areas where performance gains could be expected.
You will first get familiar with an existing CPU design, with the performance counter architecture and with the state-of-the-art performance analysis infrastructure. You will then develop a new structured, data-science based infrastructure to point engineers faster to areas identified as ‘easy wins’ (time-to-invest vs performance gain), or to design areas identified as performance-limiting factors, for given industry standard benchmarks and CPU configurations. Finally, you will deploy the solution to other Arm processors, which may allow for an accelerated, targeted and quantifiable performance optimisation across design areas and benchmark scores.

Benefits and package

Salary: €1500 per month
Free car parking
Luncheon vouchers and Public Transport Pass reduction
Team and social events
Posting date: 13-11-2018