Thursday, March 13th, 2014, Michael Barr: During the Gulf War, twenty-eight American soldiers were killed and nearly one hundred others were wounded when a Patriot missile-defense system failed to properly track a Scud missile launched from Iraq. The cause of the failure was later found to be a programming error in the software embedded in the Patriot's weapons-control computer.
On February 25, 1991, Iraq successfully launched a Scud missile that struck a U.S. Army barracks near Dhahran, Saudi Arabia. The 28 deaths from that one Scud made it the single deadliest incident of the war for American soldiers. Interestingly, the "Dhahran Scud", which killed more people than the roughly 70 earlier Scud launches, was apparently the last Scud fired in the Gulf War. Unfortunately, the Dhahran Scud succeeded where the other Scuds failed because of a defect in the software embedded in the Patriot missile-defense system. The same bug was latent in every Patriot deployed in the region. However, the presence of the bug was masked by the fact that a particular Patriot weapons-control computer had to be operating continuously for several days before the bug could create the hazard of failing to track a Scud.

A brief but thorough overview of the problem, with background on how the Patriot system works, is presented in the official post-failure analysis report of the U.S. General Accounting Office (GAO IMTEC-92-26), entitled "Patriot Missile Defense: Software Problem Led to System Failure at Dhahran, Saudi Arabia". The hindsight explanation is that a software problem "led to an inaccurate tracking calculation that became worse the longer the system operated", and the report states that "at the time of the incident, the Patriot had been operating continuously for over 100 hours", by which time "the inaccuracy was serious enough to cause the system to look in the wrong place" in the radar data for the incoming Scud. The GAO report does not go into the technical details of the specific programming error. However, I believe we can infer the following based on the information and data it provides about the incident and the defect.

A first important observation is that the CPU was a 24-bit integer-only CPU "based on a 1970s design". Consistent with the era, the code was written in assembly language. A second important observation is that real numbers (i.e., those with fractions) were apparently handled as a binary integer in one 24-bit register plus a binary fraction in a second 24-bit register. In this fixed-point number system, the real number 3.25 would be represented as binary 000000000000000000000011:010000000000000000000000, in which the ':' is my marker for the separator between the integer and fractional portions of the real number. The first half of that binary string represents the integer 3 (i.e., the bits for 2 and 1 are set, and their sum is 3). The second portion represents the fraction 0.25 (i.e., 0/2 + 1/4 + 0/8 + ...). A third important observation is that system uptime was tracked continuously by the system's internal clock "in tenths of seconds expressed as an integer." This matters because the fraction 1/10 cannot be perfectly represented in 24 bits of binary fraction: its binary expansion, as a series of ones and zeros over 2^-n places, does not terminate.
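Here is a small illustrative sketch in C (my own, not the Patriot's assembly code; the constants simply mirror the 24-bit fraction described above) showing how truncating 1/10 to 24 fractional bits produces an error that grows linearly with uptime:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* 0.1 truncated to a 24-bit binary fraction, as the fixed-point
         * format described above would store it. */
        uint32_t tenth_fixed    = (uint32_t)(0.1 * (1UL << 24));   /* 1677721 */
        double   tenth_approx   = (double)tenth_fixed / (1UL << 24);
        double   error_per_tick = 0.1 - tenth_approx;              /* ~9.5e-8 s */

        /* The clock ticks every 0.1 s, so 100 hours = 3,600,000 ticks. */
        double hours = 100.0;
        double ticks = hours * 3600.0 * 10.0;

        printf("error per tick   : %.10f s\n", error_per_tick);
        printf("error after %g h : %.4f s\n", hours, ticks * error_per_tick);
        /* Prints roughly 0.34 s after 100 hours, i.e. about 0.003433 s
         * per hour, matching the drift figure in the GAO report. */
        return 0;
    }

At Scud closing speeds, a drift of about a third of a second corresponds to a tracking window that is off by hundreds of meters.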
My understanding is that the missile-interception algorithm that failed that day works roughly as follows: Consider each object in the 3-D radar sweep data that could be a Scud missile. For each one, compute a next expected location at the known velocity of a Scud (plus or minus an acceptable window). Check the radar sweep data again at a future time to see whether the object is where a Scud would be. If it is a Scud, engage and fire missiles. Additionally, the GAO reports that the problem was a linearly accumulating error of 0.003433 seconds per 1 hour of uptime that affected every deployed Patriot equally. This was not a clock-specific or system-specific problem. Given all of the above, I reason that the problem was that one part of the Scud-interception calculation used time in its decimal representation and another part used the binary fixed-point representation. When uptime was still low, targets were found in the expected locations when they were supposed to be there, and the latent software bug stayed hidden. Of course, all of the above detail is specific to the Patriot hardware and software design that was in use at the time of the Gulf War. As the Patriot system has since been modernized by Raytheon, many details like these have probably changed.

According to the GAO report: "Army officials believed the Israeli experience was atypical and that other Patriot users were not running their systems for 8 or more hours at a time. However, after analyzing the Israeli data and confirming some loss in targeting accuracy, the officials made a software change which compensated for the inaccurate time calculation. This change allowed extended run times and was included in the modified software version that was released 9 days before the Dhahran Scud incident. However, Army officials did not use the Israeli data to determine how long the Patriot could operate before the inaccurate time calculation made the system ineffective. Four days before the deadly Scud attack, the Patriot Project Office in Huntsville, Alabama sent a message to Patriot users stating that very long run times could cause targeting problems." Very long had indeed been the run time since the last reboot of the Patriot battery that failed.

Note that if the time samples had all been in the decimal time base, or all in the binary time base, then the two radar samples being compared would always have been close in time and the error would not have accumulated with uptime. And that is the likely fix that was implemented. Here are some tangentially interesting tidbits from the GAO report: "During the Gulf War the Patriot's software was modified six times." "Patriots had to be shut down for at least 1 to 2 hours to install each software modification." "Rebooting takes about 60 to 90 seconds" and resets the time to zero. The updated software, "which compensated for the inaccurate time calculation, arrived in Dhahran" the day after the deadly attack.

In hindsight, there are some noteworthy quotes from the 1991 news articles that initially reported on this incident. For example, Brig.
Gen. Neal, U.S. Command (2 days later): The Scud apparently fragmented above the atmosphere, then tumbled downward. Its warhead blasted a crater eight feet wide in the center of the building, which is three miles from a major U.S. air base. ... Our investigation looks like this missile broke apart in flight. This particular missile was not within the parameters of where it could be attacked. And later: the incident was an anomaly that never appeared in thousands of hours of testing and involved an unforeseen combination of dozens of variables, including the Scud's speed, altitude, and trajectory.

Importantly, the GAO report states that just weeks before the Dhahran Scud, Israeli soldiers informed the U.S. Army that their Patriot had a noticeable loss of accuracy after "8 consecutive hours" of operation. So, apparently, all of those "thousands of hours" of testing involved frequent reboots. (I can picture the test documentation now: "Step 1: Power up the Patriot. Step 2: Check that everything is perfect. Step 3: Fire at the dummy target.") The GAO reported that "an endurance test has been conducted to ensure that extended run times do not cause other system difficulties." Note, too, that the "thousands of hours of testing" quote was also misleading, since the Patriot software had been hurriedly modified in the months leading up to the Gulf War to track Scud missiles traveling roughly 2.5 times faster than the aircraft and cruise missiles it was originally designed to intercept. Improvements to the Scud-specific tracking/engagement algorithms were apparently still being made during the Gulf War.

Those specific theories and statements about what went wrong, or about why it must have been a problem outside the Patriot itself, were fully discredited once the source code was examined. When computer systems may have misbehaved in a lethal way, it is important to remember that newspaper quotes from those close to the designers are not scientific evidence. Indeed, the humans offering those quotes often have conscious and/or subconscious motives and blind spots that bias them toward false confidence in the computer systems. A thorough source-code review takes time, but it is the scientific way to find the root cause. The Pentagon initially explained that the Patriot batteries had held their fire in the belief that Dhahran's deadly Scud had broken up in midflight. Only later did the truth about the tragedy begin to emerge: a flaw in a computer's software shut down the Patriot's radar system, blinding Dhahran's antimissile batteries. The Pentagon had perpetuated a fiction. At least in this case, it was only a few months before the U.S. Army admitted the truth about what happened, to themselves and to the public. That is to the U.S. Army's credit. Actors in other lethal software-defect cases have been far more stubborn about admitting what has later become clear about their systems.

Monday, March 3rd, 2014, Michael Barr: If Apple's programmers had simply followed a couple of the rules in the Embedded C Coding Standard,
they could have kept the very serious SSL Gotofail bug from entering the iOS and OS X operating systems. Here we look at the programming mistakes involved and the easy-to-follow coding-standard rules that could easily have prevented the bug. In case you have not been following the computer security news, Apple last week released security updates for users of devices running iOS 6, iOS 7, and OS X 10.9 (Mavericks). This was prompted by a critical flaw in Apple's implementation of the SSL/TLS protocol, which had apparently been lurking for over a year. In a nutshell, the bug is that a block of important C source code lines containing digital-signature certificate checks was never executed, because an extra "goto fail" statement in one part of the code always forced a jump around them. This is a bug that put millions of people worldwide at risk of man-in-the-middle attacks on their apparently secure encrypted connections.

Moreover, Apple should be embarrassed that this particular bug also represents a clear failure of Apple's software process. There is debate about whether this may have been a clever insider-enabled security attack against all of Apple's users, e.g., by a certain government agency. However, whether it was an innocent mistake or an attack designed to look like an innocent mistake, Apple could have and should have prevented this bug by writing the relevant portion of the code in a simple way that would always have been more reliable as well as more secure. And so, in my opinion, Apple was clearly negligent. Here are the lines of code in question (from Apple's open-source code server), with the extra goto highlighted. The code violates at least two rules from Barr Group's Embedded C Coding Standard book. Importantly, if Apple had followed at least the first of these rules in particular, this dangerous bug would almost certainly have been kept out of every single device.

Rule: Braces shall always surround the blocks of code (a.k.a. compound statements) following if, else, switch, while, do, and for statements; single statements and empty statements following these keywords shall also always be surrounded by braces. If Apple had not violated this always-braces rule in the SSL/TLS code, there would have been either just one set of braces after each if test, or else an odd-looking, hard-to-miss chunk of code with two sets of braces after the if with the two gotos. Either way, this bug could have been avoided by following this rule and performing code review.

Rule: The goto keyword shall not be used. If Apple had not violated this never-goto rule in the SSL/TLS code, there would not have been a doubled "goto fail" line to create the unreachable-code situation. Certainly, if eliminating goto had forced each of those lines to be replaced by more than one line of code, it would also have forced the programmers to use braces.

On a final note, Apple should be asking its engineers and engineering managers about the process failures (at multiple layers) that must have occurred for this bug to make it into end-user devices.
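For reference, here is a condensed reconstruction of the failing pattern discussed above. It is paraphrased from the widely published excerpt of Apple's sslKeyExchange.c and abridged; treat the identifiers and surrounding lines as approximate rather than a verbatim copy:

    if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
        goto fail;
        goto fail;    /* the extra, unconditional goto */
    if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
        goto fail;

    /* The actual signature verification below is never reached with
     * err still equal to 0, so a bad signature is never rejected. */

    fail:
        SSLFreeBuffer(&signedHashes);
        SSLFreeBuffer(&hashCtx);
        return err;

With mandatory braces, the second goto would either have landed harmlessly inside a braced block or stood out visually during review.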
Specifically, Apple should be asking: Where was the peer code review that should have caught this, or how did the reviewers fail to catch it? Why wasn't a coding-standard rule adopted to make defects like this easier to spot during peer code reviews? Why wasn't a static-analysis tool, such as Klocwork, used, or how did it fail to detect the unreachable code that followed? Or, if such a tool was in use at Apple, how did its users fail to act? Where was the regression test case for a bad SSL certificate signature, or how did that test fail? Bugs like this one from Apple often result from a combination of errors accumulating in the face of flawed software-development processes. Too few programmers recognize that many bugs can be kept entirely out of a system simply by adopting (and rigorously enforcing) a coding standard that is designed to keep bugs out.

Saturday, October 26th, 2013, Michael Barr: In early 2011, I wrote a couple of blog posts (here and here) as well as a later article (here) describing my initial thoughts upon skimming NASA's official report on its analysis of Toyota's electronic throttle control system. Half a year later, I was contacted and retained by attorneys for the numerous parties involved in suing Toyota for personal injuries and economic losses stemming from incidents of unintended acceleration. As a result, I got to look at Toyota's engine source code directly and judge for myself. Since January 2012, I have led a team of seven experienced engineers, including three others from Barr Group, reviewing Toyota's electronic throttle and some other source code, as well as related documents, in a secure room near my home in Maryland. This work proceeded in two rounds, with a first round of expert reports and depositions in July 2012 that led to a billion-dollar economic-loss settlement as well as an undisclosed settlement of the first personal-injury case set for trial in U.S. Federal Court. The second round began with a written expert report of more than 750 pages by me in April 2013 and culminated this week in an Oklahoma jury's decision that multiple defects in Toyota's engine software directly caused a September 2007 single-vehicle crash that injured the driver and killed her passenger. Significantly, this was the first and only jury so far to hear any opinions about Toyota's software defects. Earlier cases either predated our access to the source code, applied a non-software theory, or were settled by Toyota for an undisclosed sum. In our analysis of Toyota's source code, we built upon NASA's prior analysis. First, we looked more closely at more lines of the source code, for more vehicles, for more man-months.
And we also did a bunch of things that NASA did not have time to do, including reviewing the internals of the operating system, reviewing the source code of Toyota's "monitor CPU", performing an independent stack-depth analysis, running portions of the main CPU's software, including the RTOS, in a processor simulator, and demonstrating, in 2005 and 2008 Toyota Camry vehicles, a link between loss of throttle control and the numerous defects we found in the software. In a nutshell, the team led by Barr Group found what the NASA team sought but could not find: "a systematic software malfunction in the Main CPU that opens the throttle without operator action and continues to properly control fuel injection" and that is not reliably detected by any fail-safe. To be clear, NASA never concluded that software was not at least one of the causes of Toyota's high complaint rate for unintended acceleration; they just said they weren't able to find the specific software defect that caused unintended acceleration. We did.

Now it is your turn to judge for yourself. Although I don't believe you can find my expert report outside of the court system, here are links to the transcript of my expert testimony to the Oklahoma jury and a (redacted) copy of the slides I shared with the jury in Bookout, et al. v. Toyota. Note that the jury in Oklahoma found that Toyota owed each victim $1.5 million in compensatory damages and also found that Toyota acted with "reckless disregard". That latter legal standard meant the jury was headed into deliberations on additional punitive damages when Toyota called the plaintiffs to settle (for yet another undisclosed amount). It has been reported that some 400 other personal-injury cases are still working their way through various courts. Toyota continues to deny publicly that there is a problem and appears to have no plans to address the unsafe design and inadequate fail-safes in its drive-by-wire vehicles, the electronics and software design of which is similar across most Toyota, Lexus, and Scion vehicles manufactured over at least the last ten model years. Meanwhile, incidents of unintended acceleration continue to be reported in these vehicles (see also NHTSA's complaint database), and these new incidents, when the injuries are serious, continue to give rise to new personal-injury lawsuits against Toyota. In March 2014, the U.S. Department of Justice announced a $1.2 billion settlement in a criminal case against Toyota. As part of that settlement, Toyota admitted to having lied to NHTSA, to Congress, and to the public about unintended acceleration, and also to putting its brand before public safety. However, Toyota has still made no safety recall for the defective engine software. On April 1, 2014, I gave a keynote speech at the EE Live conference, which addressed the Toyota litigation in the context of past lethal embedded-software failures and the coming era of self-driving vehicles. The slides from that presentation are available for download at barrgroup/killer-apps/.
On October 30, 2014, Italian computer scientist Roberto Bagnara presented a talk entitled "On the Toyota UA Case and the Redefinition of Product Liability for Embedded Software" at the 12th Workshop on Automotive Software & Systems in Milan.

Friday, November 9th, 2012, Michael Barr: Dear reader, it has been more than six months since my last blog post. My apologies for being absent without leave from this blog and from my Firmware Update e-newsletter. I have never been as busy, professionally, as in the last 14 months. I recognize that I have been quiet for too long for many of you, and I note that several readers have written to ask whether I am okay or what is keeping me so busy. I am grateful for your concern and also for your patience. I expect this to be the first of several posts I will write over the coming weeks, and I expect to resume a normal pace in the months ahead. I have plenty of ideas. In addition to launching the new company, Barr Group, and bringing on CEO Andrew Girson earlier this year, here's a quick rundown of just some of what has been keeping me so busy:

Toyota Unintended Acceleration. You may be aware of NHTSA's investigation and the associated NASA report on the software. About a year ago, I was retained by the plaintiffs in the consolidated personal-injury and economic-loss claims against Toyota in U.S. District Court (note: I am not involved in any of the state-court cases). I am honored to have had the opportunity to review Toyota's engine-control source code with the help of a very talented team of Barr Group and other engineers. We have been able to push the source-code analysis deeper than NASA did, and also across many more vehicle model years and models.

Smartphone vs. Apple (and LG). I have also been working as an expert witness in the smartphone wars. In this matter, Barr Group's client, Smartphone, is the holder of several patents originally granted to the now-defunct Palm, which first added cell-phone capabilities to its popular handheld PDA products. My team has been working the infringement side of this patent dispute, which has required me to review Apple's iOS source code as well as LG's Android source code.

Madden Football. Another client is the original author of the popular Madden Football games for the Apple II, Commodore 64, and IBM PC. He is suing the game's publisher (a little company called Electronic Arts) for breach of contract and past royalties. In a nutshell, the issue is whether the move of the early PC game's code to consoles such as the Sega Genesis and Super Nintendo was a clean-room implementation or a port. Reviewing so much decades-old assembly code for 8-bit and 16-bit CPUs has reminded me how wonderful it is to program even in a "high-level language" like C.

Printers and Set-Top Boxes.
Lest the above give you the impression that I only work with plaintiffs, during this time I have also been helping: Samsung defend against allegations that it misused a former partner's software in its printers; Motorola Mobility (now Google) defend claims that its set-top boxes infringe a couple of Microsoft patents; and a Canadian satellite-TV company defend against allegations that it allowed a rival cable-TV company's service to be pirated to the detriment of that business. Although I never expected or planned to work with so many lawyers when I majored in electrical engineering and practiced embedded software development, I greatly enjoy working as an expert witness. On the one hand, I enjoy the range that is required: having to understand the technical issues as well as finding ways to explain them to less technical audiences, including judges and juries. On the other hand, reading so much source code written by others, and doing related forms of reverse engineering, has continued to inform my opinions about best practices in embedded-software process and architecture. As these and other cases wind down over the coming months and years, I look forward to sharing some of those lessons with you on this blog and in my other work as a consultant and trainer.

Tuesday, March 13th, 2012, Michael Barr: In this era of 140 characters or less, it has been well and concisely stated that "RELIABILITY concerns ACCIDENTAL errors causing failures, whereas SECURITY concerns INTENTIONAL errors causing failures." In this column I take a look at both reliability and security, especially as they relate to the design of embedded systems and their place in our modern, network-connected, security-conscious world. As embedded-systems designers, the first thing we must accomplish on any project is to make the hardware and software work. That is, we need to make the system behave as it was designed to. The first iteration of this is often flaky: certain uses of or disturbances to the system by testers can easily knock the system into a non-working state. In common parlance, "expect bugs." Given time, cycles of debugging, tweaking, and testing can get us past the bugs and through to a shippable product. But is a debugged system good enough? Neither reliability nor security can be tested into a product. Each must be designed in from the start. So let's take a closer look at each of these two important aspects of modern embedded-systems design and then bring them back together at the end.

Reliable Embedded Systems. A product can be stable yet lack reliability. Consider, for example, an anti-lock braking computer installed in a car. The anti-lock braking software may be bug-free, but how does it function if a critical input sensor fails? Reliable systems are robust in the face of adverse runtime environments. Reliable systems are able to handle the errors they encounter in the field, as they occur, in ways that minimize the number and the impact of failures. One key strategy for building reliable systems is the elimination of single points of failure. For example, redundancy could be added around that critical input sensor, perhaps by adding a second sensor in parallel with the first.
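As an illustration of that last point, here is a minimal hedged sketch in C of reading two redundant sensors and cross-checking them before trusting either value. The driver functions and the plausibility window are invented for illustration, not taken from any real braking system:

    #include <stdbool.h>
    #include <stdlib.h>

    #define MAX_DISAGREEMENT  5   /* plausibility window, in sensor units */

    /* Hypothetical drivers for two sensors wired in parallel. */
    extern int read_wheel_speed_a(void);
    extern int read_wheel_speed_b(void);

    /* Writes a trusted value only when the redundant sensors agree within
     * the plausibility window; otherwise reports a fault so the caller can
     * fall back to a degraded (fail-safe) mode. */
    bool get_wheel_speed(int *speed_out)
    {
        int a = read_wheel_speed_a();
        int b = read_wheel_speed_b();

        if (abs(a - b) > MAX_DISAGREEMENT)
        {
            return false;     /* sensors disagree: treat as a failure */
        }

        *speed_out = (a + b) / 2;
        return true;
    }

A caller that receives false can then switch to a lower-cost fail-safe, such as the one discussed next, rather than acting on a possibly bogus reading.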
Another aspect of reliability that is under the designers' full control (at least when they consider it from the start) is "fail-safe" mechanisms. Perhaps a suitable but lower-cost alternative to a redundant sensor is detection of the failed sensor with a fallback to mechanical braking. Failure Mode and Effects Analysis (FMEA) is one of the most effective and important design processes engineers use to design reliability into their systems. Following this process, each possible point of failure is traced from the root failure outward to its effects. In an FMEA, numerical weights can be applied to the likelihood of each failure as well as to the severity of its consequences. An FMEA can thus help you produce a cost-effective but higher-reliability design by highlighting the most valuable places to insert redundancy, fail-safes, or other elements that strengthen the overall reliability of the system. In certain industries, reliability is a key factor in product safety. And that is why you see these techniques, FMEA, and other design-for-reliability processes being applied by the designers of safety-critical automotive, medical, avionics, nuclear, and industrial systems. Of course, the same techniques can be used to make any type of embedded system more reliable. Whatever your industry, it is usually difficult or impossible to make your product more reliable through patches. There is no way to patch in hardware such as the redundant sensor, so your options may be reduced to a fail-safe that is helpful but less reliable overall. Reliability cannot be patched or tested or debugged into your system. Rather, reliability must be designed in from the start.

Secure Embedded Systems. A product can also be stable yet lack security. For example, an office printer is the kind of product most of us buy and use without giving a minute's thought to security. The printer's software may be bug-free, but is it able to prevent a would-be intruder from remotely capturing an electronic copy of everything you print, including your sensitive financial documents? Secure systems are robust in the face of persistent attack. Secure systems are able to keep hackers out by design. One key strategy for building secure systems is to validate all inputs, especially those that arrive over an open network connection. For example, security could be added to a printer by hardening it against buffer overflows and by encrypting and digitally signing its firmware. One of the unfortunate facts of designing secure embedded systems is that the hackers who want to get in need only find and exploit a single weakness. Adding layers of security is good, but if even one of those layers remains fundamentally weak, a sufficiently motivated attacker will eventually find and breach that defense. But that is no excuse not to try. For years, the world's largest printer maker apparently gave little thought to the security of the firmware in its home/office printers, even as it was placing tens of millions of tempting targets out in the world.
Now the security of those printers has been breached by security researchers with a reasonable awareness of embedded-systems design. Said one of the lead researchers: "We can modify the firmware of the printer as part of a legitimate document. It renders correctly, and at the end of the job there's a firmware update. ... In a super-secure environment where there's a firewall and no access (the government, Wall Street), you could send a résumé in to print." Security is a brave new world for many embedded-systems designers. For decades we have relied on the assumption that the microcontrollers, Flash memory, real-time operating systems, and other less-common technologies we use would shield our products from attack. Or that we could get enough security through obscurity by keeping our communications protocols and firmware-update processes secret. But we no longer live in that world. You must adapt. Too often, the very ability to update a product's firmware in the field is the vector that's used to attack. This can happen even when a primary goal for including remote firmware updates is motivated by security. For example, as I have learned in my work as an expert in numerous cases involving reverse engineering of satellite-TV piracy techniques and technology, much of that piracy has been powered by the very same software-patching mechanism that allowed the broadcasters to deliver security updates and electronic countermeasures. Ironically, if the security smart cards in those set-top boxes had contained only masked ROM images, the overall security of the system might have been higher. That is certainly not what the system's designers had in mind. But security is also an arms race. Like reliability, security must be designed in from the start. Security cannot be patched or tested or debugged in. Nor can you simply add security as effectively once the product ships. For example, an attacker who wished to exploit a current weakness in your office printer or smart card might download his hack software into your device and write-protect his sectors of the flash today, so that his code could remain resident even as you applied security patches.

Reliable and Secure Embedded Systems. It is important to note at this point that reliable systems are inherently more secure. And, vice versa, secure systems are inherently more reliable. So although design for reliability and design for security will often individually yield different results, there is also an overlap between them. An investment in reliability, for example, generally pays off in security. Why? Well, because a more reliable system is more robust in its handling of all errors, whether they are accidental or intentional. An anti-lock braking system with a fallback to mechanical braking for increased reliability is also more secure against an attack on that critical hardware input sensor. Similarly, those printers wouldn't be at risk of fuser-induced fire in the case of a security breach if they were never at risk of fire in the case of any misbehavior of the software.
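Stepping back to the earlier advice to validate all inputs: here is a minimal hedged sketch in C of parsing a network message with explicit length checks before copying, the kind of check whose absence leads to the buffer overflows mentioned above. The framing and field names are invented for illustration:

    #include <stdint.h>
    #include <string.h>
    #include <stdbool.h>

    #define MAX_PAYLOAD  128u

    /* Hypothetical wire format: [1-byte type][1-byte length][payload...] */
    bool parse_message(const uint8_t *frame, size_t frame_len,
                       uint8_t *payload_out /* at least MAX_PAYLOAD bytes */)
    {
        if (frame == NULL || frame_len < 2u)
        {
            return false;                       /* too short to be valid */
        }

        uint8_t declared_len = frame[1];

        /* Reject anything whose declared length disagrees with what was
         * actually received or would overflow the destination buffer. */
        if (declared_len > MAX_PAYLOAD || (size_t)declared_len + 2u > frame_len)
        {
            return false;
        }

        memcpy(payload_out, &frame[2], declared_len);
        return true;
    }

The same wariness applies to a firmware image arriving over the same connection: check its length, and verify its digital signature, before a single byte of it is written to flash.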
Consider, importantly, that one of the first things a hacker intent on breaching the security of your embedded device might do is to perform a (mental, at least) fault tree analysis of your system. This attacker would then target her time, talents, and other resources at the one or more single points of failure she considers most likely to fail in a useful way. Because a fault tree analysis starts from the general goal and works inward deductively toward the identification of one or more choke points that might produce the desired erroneous outcome, attention paid to increasing reliability, such as via FMEA, usually reduces choke points and makes the attacker's job considerably more difficult. Where security can break down, even in a reliable system, is where the possibility of an attacker's intentionally induced failure is ignored in the FMEA weighting and thus possible layers of protection are omitted. Similarly, an investment in security may pay off in greater reliability, even without a directed focus on reliability. For example, if you secure your firmware-upgrade process to accept only encrypted and digitally signed binary images, you'll be adding a layer of protection against an inadvertently corrupted binary causing an accidental error and product failure. Anything you do to improve the security of communications (e.g., checksums, prevention of buffer overflows, etc.) can have a similar effect on reliability.

The Only Way Forward. Each year it becomes increasingly important for all of us in the embedded systems design community to learn to design reliable and secure products. If you don't, it might be your product making the wrong kind of headlines and your source code and design documents being pored over by lawyers. It is no longer acceptable to stick your head in the sand on these issues.

Key Learnings from Past Safety-Critical System Failures. Tue, 2014-09-09 13:00 - Dan Smith

Ever had a DVD player freeze? A mobile phone crash and reboot? A home router that required a reset? Welcome to the 21st century, where every device has at least one processor. Without question, our daily lives are enhanced by the embedded systems around us. And many of us take for granted that these devices will do their intended jobs tirelessly and correctly, day in and day out, without fail. But not all embedded systems are blenders and DVD players. Just today, your life was in the hands of multiple embedded systems. From traffic light controls and automotive controls, to life-saving medical devices, avionics, and energy production, embedded systems are entrusted to perform correctly and keep us safe. Yet many safety-critical devices do not operate correctly 100% of the time. Here we examine some of the more notable firmware failures, describing the products, the defects, the root causes, and what could have been done better. We'll discuss what we've learned, where we are today, and what the future may hold. We've all heard the quote, "Those who cannot remember the past are condemned to repeat it." So why is it that when it comes to safety- or mission-critical firmware, we seem to have amnesia? The remainder of this page is a transcript of a 1-hour webinar. A recording of the webinar is available at vimeo/105886749.

Slide 1: Safety-Critical Firmware: What can we learn from past failures? Good afternoon and thank you for attending Barr Group's webinar on Safety-Critical Firmware: What can we learn from past failures? My name is Jennifer and I will be moderating today's webinar.
Our presenters today will be Michael Barr, Chief Technical Officer at Barr Group, and Dan Smith, Principal Engineer at Barr Group. Today's presentation will last approximately 40 minutes, after which there will be a moderated question-and-answer session. To submit a question during the event, type your question into the rectangular space at the bottom right and click the send button. Questions will be addressed at the conclusion of the presentation. At the start of the webinar, check to be sure that your audio speakers are on and that the volume is turned up. Please make sure to close all background programs and turn off anything that might affect your audio feed during the webinar. I am pleased to present Michael Barr and Dan Smith's webinar, Safety-Critical Firmware: What can we learn from past failures?

Slide 2: Michael Barr, CTO. Thank you and welcome, everyone. I'm Michael Barr, the Chief Technical Officer of Barr Group.

Slide 3: Barr Group - The Embedded Systems Experts. Our company helps other companies make their embedded systems safer and more secure. What I mean by that is that we don't make products of our own; rather, we help client companies make medical devices and industrial controls and consumer electronics and other types of products, and we do that in various different ways. We do actual product development work, where we take on some or all of the mechanical, electrical, and software design to assist companies. And we also provide engineering guidance and consulting to engineering managers and directors and vice presidents who are interested in leading change in their organizations, to improve the process or to re-architect or architect the embedded systems and embedded software, always with a focus on making them ultimately more reliable, safer, and more secure systems. In addition to that, we also have a mission of training; part of that is motivated by our desire to improve as many embedded systems, and make as many embedded systems safer and more reliable, as possible. We deliver that training both in private settings at companies and also publicly, such as at our upcoming public boot camps.

Slides 4-5: Upcoming Public Courses. Before we get to today's course, I just want to alert you that we regularly do trainings at private companies (you can see a list of all the courses we offer on our website), and we also have public trainings, including some upcoming boot camps. These are our week-long, four-and-a-half-day, hands-on, intensive software training courses. The course titles, dates, and other details for all of our upcoming public courses are available in the Training Calendar area of our website.

Slide 6: Dan Smith, Principal Engineer. The final thing I would like to do before we get started is to introduce today's speaker. Mr. Dan Smith is a principal engineer at Barr Group. He has more than twenty years of experience designing embedded hardware and software for a variety of different industries, using a variety of different tools in terms of real-time operating systems, processors, development tools, et cetera, but always with a focus on secure and safe systems. And with that, I'll introduce Dan and let him take it from here. Thank you.

Slide 7: Overview of Today's Webinar. Thanks, Michael, and thank you, Jennifer, as well. Welcome, everybody; thank you for joining us for this webinar today. I'd like to give you a quick overview of what we're going to be talking about.
What we're going to do is take a look at four specific case studies where software had a significant role in the failure of a critical system, and we're doing that because we want to examine what the root cause was and understand how that problem could have been prevented. As you will see, in retrospect, when we look at what the root causes of these problems were, it will seem pretty obvious what could have been done to prevent them; but obviously it's not as obvious as we would think, because otherwise we wouldn't be continuously repeating these problems over and over again. Ultimately it's going to turn out that the answer is a combination of education and knowledge, training, good software process, and remaining vigilant toward these bugs and not getting lazy or complacent. As far as the content of this webinar goes, it's useful if you have a working knowledge of C or C++, but that's not absolutely necessary.

Slide 8: Critical Systems. Considering the title of this webinar, it seems only appropriate that we should define what we mean by a critical system. A pretty standard definition for a critical system is a system whose malfunction can cause injury or death. We can all relate to that: things like avionics, medical devices, and so on. We also want to acknowledge the fact that there are a lot of industries that might not be considered critical per se, things like high-security systems, high-availability systems, even mission-critical systems that are unmanned and might not cause injury or death, for example unmanned underwater exploration or unmanned space exploration; but still, the cost of failure in a mission like this is very high. So we want to include those in our definition as well. Also, I want to mention that there are all sorts of failure modes, but generally with failures of critical systems one of two things happens: either the product takes action when it shouldn't, or does the wrong thing, or it fails to take the appropriate action at the right time. So, for example, you might have a medical device that delivers too much radiation, or you might have a collision-avoidance system that, instead of steering away from a collision, steers directly into a collision. That would be something the product actively did. You can also have failures that are the opposite of that, where the product fails to perform its function: things like an airbag failing to deploy when it should, or a lifesaving medical device failing to deliver the correct therapy at the right time, or perhaps a gas detector that doesn't warn when dangerous levels of the gas have built up.

Slide 9: Looking Through a Keyhole. Before we proceed, I just want to acknowledge up front that the engineering of critical systems is an extremely rigorous, multi-disciplinary process, and we're only going to be covering a very small slice of the entire development process, specifically the firmware-engineering part. Even then, we're going to be talking about the implementation phase: not firmware architecture, not the design process, not all the testing techniques.

Slide 10: Role of Firmware in Critical Systems. One of the reasons we're focused entirely on firmware in this webinar is the fact that firmware is playing an increasingly important role in critical systems today. Everything from transportation, whether it's automobiles, locomotives, or avionics, to mobile electronics, security systems, and medical devices, which are in charge of sustaining health or even saving lives, et cetera; we could go on and on.
I'm sure we have a very large range of industries represented in our webinar audience today. The reality is that more and more functionality is being pushed into the firmware, and as you know, the more functionality you put in, the more the complexity increases; and as the complexity increases, so does the potential for bugs. Just think about it: many of you probably drove to work today. If you thought about how many times your life was in the hands of a critical system, you'd probably never get out of bed in the morning. Just in your vehicle, think of how many computers, how many processors there are; plus you have traffic lights, and you have other vehicles on the road. There are all sorts of systems interacting.

Slide 11: Ripped from the Headlines. So today we're going to be looking at four case studies from the past. But before we jump into those, I very briefly want to mention something that just recently happened. This was late August 2014. Some of you might have heard about this; I've included a screenshot from the website of a Russian news agency, along with the URL, so you can look at it. Many of you might be familiar with Galileo; this is Europe's global positioning system. It's a 13-billion-dollar, thirty-satellite network, and it's being built right now. There are four satellites in orbit. Just recently, satellites 5 and 6 were supposed to be put up into their orbits. This was late August, as I said, and unfortunately, due to a problem, the satellites were put into a wrong orbit and are now most likely stranded in an unusable orbit; scientists might be able to do something to rescue them. It won't be the first time that a rabbit has been pulled out of a hat, so to speak, when something like this has happened, but it's also very possible that the satellites will have to be destroyed. What actually happened was that the satellites failed to reach their intended orbit, and it's believed, or at least it's been alleged, that this was caused by software errors in the Fregat-MT rocket's upper stage. So, quoting from the news story: the non-standard operation was likely caused by an error in the embedded software; operating in full accordance with the embedded software, the rocket delivered the units to the wrong destination. That's a quote from the Russian news agency.

Slide 12: Hindsight. Finally, before we get into the four cases, I do want to say one other thing. As we're looking at these case studies, many of you will probably say to yourselves, "How did they let this happen? How could this possibly have escaped into the field?" Of course, in hindsight the problem is obvious and it seems like it should have been caught, but the reality is that the same mistakes happen over and over again, made by smart people. So clearly it's not as obvious as it might seem. This is the same thing with security. I spend a lot of time doing security work at Barr Group. We do a lot of work with security, and many of you who work with security know it's a small subset of problems that cause the majority of security breaches, things like buffer overflows. So clearly this is something that continues to plague us. The point of this webinar is not to point the finger and criticize or taunt or say "ha-ha." We are all capable of making these kinds of mistakes. The point of this webinar is just to educate ourselves and hopefully develop a greater awareness, so that we have less of a likelihood of repeating the same problems in the future.
Slide 13: Therac-25. Okay, so for our first case study we're going to talk about the Therac-25, which was a radiation-therapy machine, obviously for cancer treatment. Its predecessors date to the 1970s; the time frame we're talking about now is the early to mid-1980s. Without getting into a lot of detail about the architecture, it was essentially an embedded DEC PDP-11 minicomputer. This machine generated radiation that was sent into the patient in order to destroy cancerous tumors. It had two modes of operation: the first was direct electron-beam therapy, which uses a low-power electron beam whose energy was spread using scanning magnets. The second, newer mode of operation was called megavolt x-ray mode, and this delivered much higher-energy x-rays to the patient.

Slide 14: Therac: What Happened. So what happened with the Therac-25? Well, a complete discussion of this would take an hour, so I'm going to cut to the chase and get right to the core defects. First, there was a race condition that allowed the high-powered x-ray beam to be activated without the beam spreader or diffuser plate in place to distribute the energy. In other words, you had this high-powered x-ray beam going right into the patient without being diffused at all, due to a race condition. And there was a second software problem that was responsible for another failure mode, which resulted in massive radiation overdosing. The software used an eight-bit unsigned counter as parameters were entered into the machine, and this was continuously checked against the prescription. Once the parameters matched up, the counter was reset to zero and the treatment could be started. But here's the tragic flaw. Obviously, the maximum value of this counter is 255, but what would happen if the software incremented that counter, deliberately of course, when it held the value 255? Well, of course it's going to wrap around to zero; that's well defined. And, just as we discussed, when this magic value of zero is in the counter, it indicates that we're ready to begin the treatment, albeit in this case with the wrong parameters. So if the operator started treatment when the counter had accidentally overflowed and wrapped to a value of zero, you can imagine the tragic results that would occur (a small illustrative sketch appears below, after the findings).

Slide 15: Therac: Findings. Well, after all this unfolded, in addition to finding the root causes of the couple of bugs we just discussed, a bunch of other troubling problems, defects, and breakdowns were discovered. First, it was deemed that the software development process was woefully inadequate and immature, resulting in essentially untestable software. They also determined that the reliability modeling was very incomplete and the failure-mode analysis was also very inadequate. There was no independent review of the critical software; in fact, it's not clear just how detailed the review was even within the organization. There was improper reuse of software from older models when it wasn't a direct carry-over. Lastly, one of the issues that resulted in the race condition was that, even though a multi-tasking operating system was used, there was improper inter-task synchronization, leading to one of the problems. A couple of other points of note: first, the system was implemented entirely in assembly language, and the second point to mention is that the system used its own in-house operating system. Now, granted, both of these were more common 30 years ago.
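Returning to the Slide 14 defect, here is a hedged illustration in C (not the actual Therac-25 code, which was PDP-11 assembly; names and structure are invented) of how an unchecked 8-bit counter silently wraps to the "ready" value of zero:

    #include <stdint.h>
    #include <stdbool.h>

    /* 0 means "parameters verified, OK to begin treatment". */
    static uint8_t setup_pending = 0;

    /* Called on each pass in which an entered parameter still disagrees
     * with the prescription. */
    void note_unverified_parameter(void)
    {
        /* BUG: after 255 increments this wraps back to 0, which the rest
         * of the code interprets as "everything checks out". A safer
         * version would saturate at 255 or use a wider type. */
        setup_pending++;
    }

    bool ready_to_treat(void)
    {
        return (setup_pending == 0);
    }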
It was not unheard of to use either of these techniques, but I think today we can all agree that both practices would be frowned upon.

Slide 16: Ariane 5 / Flight 501. Our second case study focuses on the Ariane 5 rocket, and specifically on its maiden flight in June 1996, also known as Flight 501. A little background on Ariane 5: it was the successor to the smaller Ariane 4 rocket. Ariane 5 was designed to carry much bigger, heavier payloads, and in fact today it is really the standard launch vehicle for the European Space Agency. The payload for this maiden flight was named Cluster, and it consisted of four fairly heavy spacecraft, each about 1,200 kilograms, whose mission was to study the Earth's magnetosphere. Unfortunately, as we're about to see, this payload never made it into space and the mission was scrapped.

Slide 17: Flight 501 Failure. So what actually happened? How did Flight 501 fail? We'll talk about the root cause on the next slide, but first I want to describe what actually happened. About 37 seconds into the launch, both inertial navigation systems crashed; there were two systems on board, two computers running the same software, so the same defect that crashed one crashed the other, no surprise in hindsight. This caused the thrusters to steer, or swivel, into extreme positions that were absolutely not correct for the intended flight path. So the vehicle departed from its intended flight path, and especially at the speeds the rocket was traveling, this put extreme stresses on the rocket; mechanically speaking, the vehicle actually began to break up, and this triggered the onboard flight-termination system. So it deliberately self-destructed, thank goodness, but the result was that the mission was a failure and the cost, including the destroyed payloads, was approximately 370 million dollars.

Slide 18: Flight 501 - Cause. Now let's talk about the cause of the failure. The inertial navigation system on Ariane 5 was reused from Ariane 4, but unfortunately the assumptions made for Ariane 4 did not necessarily apply to Ariane 5. Flight 501 had a much greater horizontal velocity, and this value was tracked as a 64-bit floating-point value. In the software, this needed to be converted and stored into a 16-bit integer, but because of the greater velocity there was an overflow when this conversion took place. And another tragic thing is that the overflow checks that could have caught this, and tried to react to or recover from it, were omitted for efficiency. Something I want to point out here is that the implementation language was not C, even though this could have happened in the C programming language; the programming language here was Ada. Even though a language like Ada is renowned for being a safer language, and a lot of avionics and similar systems are implemented in Ada, this just goes to show that there are certain kinds of defects the programming language can't keep you from making. Another sad part of this whole episode is that the software containing this error, which caused the thrusters to swivel in the wrong direction and put the rocket off track, was not even needed after launch. So this software should not even have been running once the rocket had lifted off.

Slide 19: MISRA C: 2012. So I mentioned on the previous slide that some run-time checking could have caught the overflow; a sketch of such a check appears just below.
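This sketch is in C (the actual Ariane code was Ada), and the function name and structure are invented for illustration; it shows the kind of guarded narrowing conversion that was left out:

    #include <stdint.h>
    #include <stdbool.h>

    /* Convert a 64-bit floating-point value to int16_t only if it fits.
     * Returns false (and leaves *out untouched) on overflow, so the
     * caller can degrade gracefully instead of raising an unhandled
     * exception. */
    bool narrow_to_int16(double value, int16_t *out)
    {
        if (value > (double)INT16_MAX || value < (double)INT16_MIN)
        {
            return false;          /* out of range: report, don't trap */
        }

        *out = (int16_t)value;
        return true;
    }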
Slide 19: MISRA C:2012
So I mentioned on the previous slide that some of the run-time checking that could have caught the overflow was omitted for reasons of efficiency. What I want to point out is that a coding standard such as MISRA C:2012 speaks directly to this. I suspect everybody listening today is familiar with the MISRA C coding guidelines; I'm talking about the 2012 version. Directive 4.1, for example, says "Run-time failures shall be minimized." You could rephrase that, or get a little more verbose, and say: code should be written to anticipate and defend against run-time errors by adding run-time checks throughout the code. We all know that C's run-time environment is very lightweight; that's what makes the language so efficient and so compact. But you can have things like unchecked array accesses, divide-by-zero errors, issues with dynamic memory allocation, et cetera. Presumably most of you working on critical systems don't have dynamic allocation in the first place. But what I'm saying is that with C's very lightweight run-time environment, the burden is actually on you, the programmer, to do this kind of error checking. The run-time is not going to trap these kinds of issues and put you into a safe mode. So, effectively, the tactic you really need to adopt is to implement extensive run-time checking in your code.

Slide 20: Assertions
So, how do we perform this dynamic checking at run-time? Well, most programming languages, including the C programming language, actually have a built-in mechanism for this. With C and C++ there's a macro called assert(), which you get by including the correct header file, and by using it you can confirm that your assumptions at run-time are actually being adhered to. The expression passed to assert() is expected to always evaluate to true. Let's look at the example we have here in code: a function called isInRange(). You pass it three parameters — a lower bound, an upper bound, and a value you want to check against. Now, presumably when you call isInRange(), lowerbound should be less than or equal to upperbound; it doesn't make any sense for the lower bound to be greater than the upper bound. So the first thing this routine does is check just to make sure that the lower bound is less than or equal to the upper bound. If that's not the case, there's no meaningful comparison; you can't check whether something is in range.
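The slide's code isn't reproduced in this transcript, so here is a minimal sketch of what an isInRange() routine of that shape might look like; the parameter types and formatting are my guesses, not the original slide code:

    #include <assert.h>
    #include <stdbool.h>

    /* Assumed signature: lower bound, upper bound, then the value to test. */
    bool isInRange(int lowerbound, int upperbound, int value)
    {
        /* The caller's obligation: a range with lower > upper is
         * meaningless, so treat it as a bug rather than guessing.          */
        assert(lowerbound <= upperbound);

        return (value >= lowerbound) && (value <= upperbound);
    }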
Slide 21: Removing Assertions
So, I've just been advocating using assertions to perform this dynamic run-time checking, but I need to acknowledge that there is a cost to using them. There's both a run-time cost — in other words, extra CPU processing — and an increase in code size, because these assertion checks do result in more code. My personal belief is that you should never disable these assertions, even in your production code. This does mean that you need to budget enough resources (CPU and memory) to run with all assertions enabled at all times. Presumably if you're using assertions, you're going to use them during testing and debugging, but what a lot of organizations do by the time they go to production is disable assertions, and that's very easily done, usually through just a single command-line compiler switch. Presumably the thinking is that by the time you're going to production, none of the assertions are triggering, you feel pretty good about it, and so you can go ahead and remove them. In my opinion, and in the opinion of other people I respect a great deal, disabling assertions for production code only makes sense if you believe that testing catches all problems — and I think we can all agree that testing does not catch all problems. So the fact that your assertions are no longer triggering, or firing, in your lab does not mean they won't trigger in the field, especially with a lot of deployed devices in environments you might not have anticipated. Where it gets tricky is when you're very resource constrained, which does happen in the embedded world, and you're running into a barrier — you're either CPU limited or memory limited — and an obvious candidate is to just go ahead and disable assertions. I still say that's almost never the right thing to do. Now, we feel strongly enough about the importance of keeping assertions enabled in your code, if you have that possibility, that if you're still not convinced, we're going to give you a couple of resources towards the end of this webinar from people whose names you might know, and hopefully they will convince you of the importance of keeping these assertions enabled. Because it's really one of the most important things you can do to keep your software from going off the rails.

Slide 22: Flight 501 - Lessons
So, what are some of the key takeaways, or lessons learned, from Flight 501? Well, one of the most important is that reuse of software can be hazardous. When you're deciding to reuse software from a previous project, the burden is on you, the developer, to ensure that it will function properly in the new environment. That's not always as easy as it sounds; we have two examples of that, the Therac-25 and Flight 501. Another issue, which we actually covered in our previous webinar on coding standards, is that mixing data types in expressions — here, in this case, mixing float and integer — can be very hazardous and can be a source of very subtle, difficult-to-find problems. Another, perhaps obvious, lesson is that there's no reason to be executing unnecessary software. As I mentioned, the software that caused the self-destruction of the Ariane 5 didn't even need to be running in the first place. I personally believe that one of the most important lessons, if not the most important lesson, from Flight 501 is that you disable assertions in your production code at your own peril. In other words, if you have the option — if you have the CPU and code space to keep those assertions enabled — I cannot recommend strongly enough that you keep them in your code. And finally, although it goes outside the discipline of firmware engineering, it's extremely important to consider the failure modes of the product you're working on and how those can interact.

Slide 23: What About Testing
So, just a brief word here about testing before we move on to our next case study. Obviously, all of the systems we're looking at were tested. They underwent significant testing, and obviously testing is extremely important, but it's not sufficient for proving the correctness of your product. Testing can never prove the absence of bugs. We've seen that over and over again: bugs always escape to the field, and these are often the ones that are very difficult to reproduce. Another thing to keep in mind is that tests are typically software as well, and there can be bugs in your test code too. So, the most important thing to remember is simply that tests are just one part of an overall strategy.
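Before moving on to the next case study, here is one way — a sketch only, with hypothetical names, not something prescribed in the webinar — to act on the keep-assertions-enabled advice: a project-specific assertion macro that is never compiled out (unlike the standard assert(), which disappears when NDEBUG is defined) and that routes failures to a controlled handler:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical fail-safe hook: a real device would enter its safe
     * state and record the location for later diagnosis.               */
    static void fault_handler(const char *file, int line)
    {
        fprintf(stderr, "assertion failed at %s:%d\n", file, line);
        exit(EXIT_FAILURE);
    }

    /* Unlike assert(), this macro is unaffected by NDEBUG, so the
     * checks survive into production builds.                           */
    #define RUNTIME_ASSERT(expr_)                     \
        do {                                          \
            if (!(expr_)) {                           \
                fault_handler(__FILE__, __LINE__);    \
            }                                         \
        } while (0)

    /* Example use: this divide is protected in every build configuration. */
    static int32_t safe_divide(int32_t num, int32_t den)
    {
        RUNTIME_ASSERT(den != 0);
        return num / den;
    }

    int main(void)
    {
        printf("%d\n", safe_divide(10, 2));   /* prints 5 */
        return 0;
    }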
Slide 24: Patriot Missile System
For our third case study, we're going to talk about the Patriot missile system. The Patriot is a highly mobile surface-to-air missile system originally designed to shoot down enemy aircraft. It became a household term in 1991, at the onset of the Persian Gulf War, where the system had been adapted and modified to shoot down Scud missiles, and apparently there was at least some kind of qualified success. Unfortunately, the Patriot's successes, whatever they were, are overshadowed by one particularly deadly failure.

Slide 25: Patriot Missile Failure
In February of 1991, an enemy missile struck American troop barracks in Saudi Arabia when a battery of Patriot missiles failed to intercept the target, the incoming Scud missile. In fact, the Patriot missile never launched. The result was 28 dead soldiers and over 100 other casualties. The root cause was determined to be a software error in the system's clock — specifically, an accumulated clock drift that worsened the longer the system had been in operation. The problem had been identified approximately two weeks before the incident, when the Israeli army reported that they'd noticed the longer the system was online, the less accurate it became, but no patch was available at the time of the incident. The recommended workaround was to reboot the system. At the time of the incident at Dhahran, the Patriot missile system had been operational for approximately 100 hours. When you work through the math of the software error, this ends up producing a clock drift of approximately 1/3 of a second, which translates into a tracking error of approximately 600 meters. One interesting side note: during the Gulf War the Patriot missile systems received six firmware updates, and each update that was supplied required the missile system to be offline for one to two hours.

Slide 26: The Patriot Software Bug
So let's talk a little bit about the details of the software defect that resulted in this incident. The Patriot missile system software had two versions of system time. The clock tick on the system was 1/10 of a second, i.e., one hundred milliseconds. So there was one clock time which was an integer number of ticks, each tick representing 1/10 of a second. There was also a decimal representation of the system time used elsewhere. The problem is — and some of you know this — there's no way to represent the value 0.1 exactly in binary. Values like 0.5 or 0.25 — negative powers of 2, since we're talking binary here — can be represented exactly. Something like 0.1 cannot be represented exactly; it's what's called a non-terminating sequence. The problem is that the conversion from integer ticks to these decimal values results in rounding error due to the imprecision, and after approximately 100 hours this adds up to about a 1/3-of-a-second drift. Radar works by accurately measuring the timing between pulses and their reflections, and for this to work well, everything has to be working off the same time base. The Patriot missile system software had been modified to calculate floating-point time more accurately than previously when it was updated to track these very fast Scud missiles. But not all parts of the software were updated, so timing measurements made with the more accurate clock were being compared against timing measurements made with the less accurate clock, which was susceptible to this rounding error. So when the two were compared against each other, there would be a discrepancy.
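To put rough numbers on that, here is a back-of-the-envelope sketch in C. It follows the commonly published account of the bug (the GAO report on the incident), in which 0.1 was effectively truncated to 23 fractional bits in a 24-bit register; treat that exact bit count and the Scud closing speed used below as assumptions of this sketch rather than figures from the webinar:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        /* 0.1 s truncated to 23 fractional bits (2^23 = 8388608).      */
        const double tick_exact  = 0.1;
        const double tick_stored = floor(tick_exact * 8388608.0) / 8388608.0;
        const double error_per_tick = tick_exact - tick_stored;

        /* 10 ticks per second, for 100 hours of continuous operation.  */
        const double ticks = 10.0 * 3600.0 * 100.0;
        const double drift = error_per_tick * ticks;

        /* Approximate Scud closing speed, to turn time into distance.  */
        const double scud_mps = 1676.0;

        printf("error per tick : %.10f s\n", error_per_tick);  /* ~9.5e-8 */
        printf("drift at 100 h : %.3f s\n", drift);            /* ~0.34   */
        printf("tracking error : %.0f m\n", drift * scud_mps); /* ~575    */
        return 0;
    }

The output lines up with the figures quoted on the slide: roughly a third of a second of drift and a tracking error on the order of 600 meters.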
Slide 27: Perils of Floating Point, 1
So we've got a couple of slides that illustrate some of the hazards you can run into when you're using floating point. I'm sure most of you can follow the program here; we'll call the functions test1() and test2(). We assign the value 0.1 to a standard 32-bit single-precision floating-point variable and then we print it. And sure enough, when we print it, we see values of 0.1 and 0.2 out to six decimal places. All looks as expected, right? So far so good. Now let's say we just change the printout to print with higher precision — to show us more digits than the standard printout shows — and look what happens. It turns out that when you assign 0.1 into the variable, it's actually stored as 0.1, a bunch of zeroes, and then a little one at the end. And if you now add another 0.1 to it, look how the error accumulates: the 0.2, it turns out, is really 0.2, a bunch of zeroes, and then a three at the end. Well, imagine this happening every tenth of a second for 100 hours — 360,000 seconds, or 3.6 million ticks — and how that error accumulates.

Slide 28: Perils of Floating Point, 2
I'd like to illustrate one other problem with using floating point; again it comes down to rounding, but I'm going to show it to you in a different way. In test3() here, we have two different variables, f1 and f2, both single-precision floating-point values. If you look at the slide while I'm talking, you'll see that essentially what we're doing is calculating the sum 0.1 + 0.3 + 0.7, and we do it two different ways, in f1 and f2. With f1 we initialize it to 0.1, then we add 0.3 and 0.7, and we expect the value to be exactly 1.1. With f2 we initialize it to 0.3, but then to get up to 1.1 we actually have a loop and we add 0.1 to it eight times. Then we print the two values at the end. You'd expect them to be the same, but when we actually run the program and print the two values, look at this: by now it's probably not surprising that f1 doesn't print as exactly 1.1 — in fact you can see the error here; it's something to be aware of, but you saw that on the last slide. But look at f2: even though we essentially expect it to reach the same value, it has a different error. And that's because each time we add a value that cannot be expressed exactly in binary, there's a rounding error. So the more operations you perform to get to the same point, the greater the error you're going to have. The point of these last two slides is simply to make you aware that using floating-point math in computers can be very tricky, and you need to be very aware of the potential problems you can run into.
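The slide code itself isn't included in this transcript, so here is a small stand-in program that reproduces the same two demonstrations; the variable names and print formats are mine:

    #include <stdio.h>

    int main(void)
    {
        /* Slide 27: 0.1 and 0.1 + 0.1 look clean at six decimal places,
         * but more digits reveal the stored values are slightly off.      */
        float a = 0.1f;
        float b = a + 0.1f;
        printf("%.6f  %.6f\n", a, b);     /* 0.100000  0.200000             */
        printf("%.10f  %.10f\n", a, b);   /* ~0.1000000015  ~0.2000000030   */

        /* Slide 28: two routes to "1.1" accumulate different errors.       */
        float f1 = 0.1f;
        f1 += 0.3f;
        f1 += 0.7f;                       /* two additions, two roundings   */

        float f2 = 0.3f;
        for (int i = 0; i < 8; i++)
        {
            f2 += 0.1f;                   /* eight additions, eight roundings */
        }

        /* Neither result is exactly 1.1, and the two differ from each other. */
        printf("f1 = %.10f\nf2 = %.10f\n", f1, f2);
        return 0;
    }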
Slide 29: Accumulated Error
So, getting back to the rounding error due to floating point and the time difference between the clocks: the end result of this discrepancy was that a Scud missile, whose launch was detected by early-warning satellites, was not tracked correctly by the Patriot's ground radar, and the system determined that there was no missile threat. Unfortunately, the radar was just looking in the wrong part of the sky due to the clock-drift problem. So how did this bug, which seems so obvious, make it into the field? Didn't anybody do any kind of testing? Well, as Michael mentioned in his keynote in April at EE Live, you can almost imagine the kind of testing and test procedures that were performed, and I would imagine, as Michael mentioned, that each test started with a newly powered-on or rebooted system — a clean slate, a fresh set of initial conditions. So this whole concept of running a system for a hundred hours and then seeing how accurate it is probably wasn't in any kind of test plan, although I certainly hope such a procedure would be in place today. Things like this — gradual resource leaks, long-term timing drifts, et cetera — may only be found by testing systems in exactly the manner and environment in which they'll be used. That's something very important to keep in mind, especially for critical systems.

Slide 30: Patriot Missile Failure: Lessons
So what are some of the lessons learned from the Patriot missile failure? Well, obviously, mixing data types such as floating point and fixed point — but even mixing signed and unsigned integers, as we saw in the previous webinar on our coding standard — can be very tricky. In fact, hopefully you just saw that even using floating point alone, by itself, can be tricky due to rounding errors and the like. When you're tracking any kind of quantity, it's very important that you're consistent in your units and your data types and that you understand issues of precision, rounding, data-type conversion, et cetera. And again, at the risk of beating a dead horse, please keep in mind that testing will not catch all problems, either because the tests are inadequate or because there are some defects, some kinds of bugs, that are so rare and so unusual that they will slip through testing. And a side note to that, as I mentioned before, concerns the test environment: it's very important that you replicate the intended environment as closely as possible when you're doing your testing.

Slide 31: Mars Climate Orbiter
Our final case study today is the Mars Climate Orbiter. One thing I want to point out from the outset is that the problem I am going to describe was not a bug or defect in the embedded computer on the orbiter; I want to mention that right up front. That said, it does illustrate a software defect problem that I want to cover, because it's very important and the consequences were very sad. So, the Mars Climate Orbiter was a small robotic space probe, about 750 lbs or roughly 340 kg, designed to study the climate and atmosphere of Mars. The probe was launched in December 1998 and began its journey to the red planet. Less than a year later, it had essentially reached Mars. Unfortunately, on September 23, 1999, the orbiter disintegrated as it passed too close to Mars' upper atmosphere on the wrong trajectory, because incorrect information from ground-based computers had been passed to it. The root cause was that the thrust impulse command for the thrusters was produced in imperial units, specifically pound-seconds, instead of the metric units, newton-seconds, that were specified. It would be similar to ordering one liter of water and receiving one gallon of water: the number is correct but the units are wrong, and that makes all the difference in the world.
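Here is a hedged sketch in C of that class of error; the function and constant names are mine, not the mission software's, but it shows how a number that is "correct" in the wrong unit silently corrupts everything downstream unless the conversion happens at the interface boundary:

    #include <stdio.h>

    #define NEWTONS_PER_POUND_FORCE  4.44822

    /* Flight side: expects newton-seconds, as specified in the interface.
     * The real computation is omitted; the value just passes through.    */
    static double trajectory_correction(double impulse_newton_seconds)
    {
        return impulse_newton_seconds;
    }

    int main(void)
    {
        double impulse_lbf_s = 100.0;   /* ground software's output, in lbf-s */

        /* The bug: the number is handed over as if it were already N-s.     */
        double wrong = trajectory_correction(impulse_lbf_s);

        /* The fix: convert at the interface boundary.                       */
        double right = trajectory_correction(impulse_lbf_s *
                                             NEWTONS_PER_POUND_FORCE);

        printf("as delivered: %.1f  correct: %.1f  (off by a factor of %.2f)\n",
               wrong, right, right / wrong);
        return 0;
    }

The factor of roughly 4.45 between pound-force-seconds and newton-seconds is exactly the kind of error that no amount of staring at the number itself will reveal.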
Slide 32: Units are Important
So, at the risk of stating the obvious, computers are used to calculate things, but most of those calculations are performed on quantities that have units. For example, if you're measuring pressure, whatever units you're using — maybe kilopascals or whatever; if you're measuring velocity, perhaps it's meters per second or miles per hour; if you're measuring flow, it might be liters per minute or milliliters per second. These all have units; these all have dimensions. And there are essentially two kinds of mistakes that are commonly made. The first kind of mistake is that the same fundamental dimension is used, but a different system — imperial or metric — gets mixed in. So here we have an example of an API that you're supposed to call with a value in meters per second, but the bug is that someone is passing it a velocity in miles per hour. They're both velocities, but they're in different measurement systems. The other kind of problem, which unfortunately is more common, is a disagreement in the fundamental dimensions. Look at this: here we have a set-acceleration call, and most of you probably know that acceleration is the first derivative of velocity with respect to time. But what we're passing it is essentially a velocity: we calculate the difference between two positions and divide it by the amount of time it took to move from position one to position two. So what we're really passing it is a velocity, not an acceleration — the units don't even match up. So just think about the products you work on, your own products. If you're working on medical devices, maybe you're measuring flow or pressure or a voltage change over time; if you're working in transportation, velocity, acceleration, temperature, et cetera. This is everywhere, no matter what product you're working on.

Slide 33: Dimensional Analysis
So now I'm going to introduce a term to you; some of you might have heard it before, for some of you it might be new: it's called dimensional analysis. Now, in C and many other programming languages, the standard types don't have any concept of units. Here is an example: int speed = 1234; well, what does that 1234 mean? It means whatever the programmer intends it to mean, but it's not obvious or intuitive just from reading the code what the units are, whether there's an implicit decimal point in there, et cetera. So what we want to do is see if we can find a way to use the language's type system to help us prevent making these kinds of mistakes. And it turns out that we can. I'm going to show you an example in C and an example in C++, and I'm also going to show you how you can use static analysis tools to catch this kind of problem, because these are the subtle kinds of problems that are very difficult to debug.

Slide 34: Using FlexeLint 9 to Expose Dimension / Unit Problems
So here's a very simple example of using meaningful typedefs and a static analysis tool's ability to warn about incorrect mathematical operations. For this demonstration I'm using a tool that we use quite a bit called FlexeLint — specifically FlexeLint 9, the newest version. The way this works is that we define unique types for distance in meters (meters), for time in seconds (sec), and for velocity. The real magic happens on line five, which is a C comment, but its special format indicates special options to the static analysis tool, in this case FlexeLint. Line five tells the tool to perform strong type checking for the types shown in parentheses. It also indicates the relationship between the types, here showing the relationship between velocity, distance, and time. This goes a long way towards addressing C's very weak type checking.
Remember that even though we have a typedef for velocity, as far as the C programming language and the compiler are concerned these are all just the same type: double. The compiler will let us assign any double to a velocity, even if that double represents a stock price or the constant pi or any other value, because a double is just a double to the compiler. Now, note that this is a very simple example; you can define as many types as you need, define every appropriate relationship between the types, and the static analysis tool will check them for you. So if you look at lines 10 through 13 in the code, and you look at the expressions on the right-hand side of the assignments, you'll notice that only line 11 doesn't evaluate to a distance over time when you cancel out all the units. And you'll notice in the output from the static analysis tool that it warns us that on line 11 we're making an invalid assignment — and that's all because of the type checking that was enabled on line 5 in the comment.

Slide 35: C: Don't Use Naked Numbers
Now, in addition to using a static analysis tool, which I strongly recommend, there are things you can do even within the C programming language to at least protect yourself a little bit. And that is to use a somewhat object-oriented approach by defining different types, or classes, with different units. So even if you're using something like an integer for speed, whether it's in centimeters per second or miles per hour, by wrapping that in a structure or a class the types are not compatible: you can't do a direct assignment, and if you try to pass a pointer to one type to a function that's expecting a pointer to another type, you'll get a warning — certainly if you're using a static analysis tool, it'll tell you that. So what you do is wrap these values in the classes you create for each of the different units you want to use, and then you just pass around pointers, or handles, to these different types. Here you see two different classes defined, for speed in miles per hour and speed in centimeters per second. Then we have functions: the first two are constructor-like functions, one to construct a speed-in-centimeters-per-second object from its native type, and another to convert a speed object that's in miles per hour into a speed object that's in centimeters per second. Presumably we would have all sorts of bounds checking and the like inside these constructor functions. And then, lastly, we have an adjust-speed API for centimeters per second: if we want to adjust the speed up or down, we pass it the speed object we want to adjust and then the actual adjustment. Notice that the adjustment, which we don't want to change, is a pointer to a constant, because it shouldn't be changed; the first parameter, current, is the object that's going to be changed. One other thing I'd like to point out is that these encapsulated types won't take up any more storage than the native types.
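Since the slide's code isn't reproduced in this transcript, here is a minimal C sketch of the same idea; the type and function names are my own stand-ins for what the slide describes:

    #include <stdint.h>

    typedef struct { int32_t value; } speed_mph;    /* miles per hour        */
    typedef struct { int32_t value; } speed_cmps;   /* centimeters per second */

    /* "Constructor" from the native type; a real version would also do
     * bounds checking here.                                                 */
    static speed_cmps speed_cmps_make(int32_t cmps)
    {
        speed_cmps s = { cmps };
        return s;
    }

    /* Explicit, named conversion between the two unit systems.
     * 1 mph = 44.704 cm/s; integer math keeps the sketch simple.            */
    static speed_cmps speed_cmps_from_mph(speed_mph mph)
    {
        speed_cmps s = { (int32_t)((mph.value * 44704L) / 1000L) };
        return s;
    }

    /* Adjust a speed in place; the adjustment itself must not be modified.  */
    static void speed_cmps_adjust(speed_cmps *current, const speed_cmps *delta)
    {
        current->value += delta->value;
    }

    int main(void)
    {
        speed_mph  limit_mph = { 65 };
        speed_cmps limit     = speed_cmps_from_mph(limit_mph);
        speed_cmps bump      = speed_cmps_make(100);

        speed_cmps_adjust(&limit, &bump);

        /* limit = limit_mph;                      -- compile error: incompatible types */
        /* speed_cmps_adjust(&limit, &limit_mph);  -- compiler/lint diagnostic          */
        return 0;
    }

The wrapper struct is the same size as the integer it holds, but the compiler now refuses to mix the two unit systems without an explicit, named conversion.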
Slide 36: Even Better: Use C++
The last point I want to make regarding dimensional analysis is that if you are using C++, you have all the tools you need to do this in an ideal way, where you can catch these problems at compile time. With C++'s stronger type system, you can exploit templates and generic programming to enforce dimensional correctness in your calculations at compile time. I could spend an hour talking about this — people much smarter than me have given long talks on it — and it's a very important topic. If you want more information and you are using C++, I would point you to a couple of resources; I'm giving you the links at the bottom of this slide. The first is a paper from Scott Meyers called Dimensional Analysis in C++, which goes through it in very good detail. Also, there's a Boost library called Boost.Units, which is used for exactly this purpose. So if you're using C++, please check out these resources.

Slide 37: Filtering Out the Defects
So now that we've looked at some of these case studies, and we've seen the defects and perhaps how they would've been caught, let's turn this into a picture. The picture we want you to have in mind is this one, with a series of cascaded screens, or filters, or sieves. Each one of these represents a step in the process — a tactic you can use to reduce defects. Each step in this process extracts defects and removes them, so by the time you get to the end, only a very few defects have escaped. Yes, it's true that we're showing a few defects escaping at the end, because honestly that's probably the most realistic picture. The point is that there's no one single thing you can do to make your system — your critical system — safe and secure. Each of these sieves is going to catch things that the others won't, but you're never going to get to perfection. And that's why I emphasize so strongly the importance of keeping these dynamic run-time checks in your code. Certainly there's overlap between these different phases — in other words, there are things you might catch in static analysis that you could also catch in code inspection — but the point is that each of these has strengths that will catch things the other phases won't.

Slide 38: Key Takeaways
So what are the key takeaways here? Obviously, there is no such thing as bug-free software. There is very, very high quality software, but there's really no such thing as bug-free software, and one implication of that is that testing is not sufficient to catch all bugs. What you really want to do is employ a strategy of defense in depth. You want to use everything at your disposal to make your software as robust as it can possibly be. For example, employ a coding standard — whether it's the MISRA coding guidelines or the Barr Group coding standard or a combination — that defines a safe subset of the C language to keep you out of those dusty corners of the language that you don't really understand. Also, having a very robust process including static analysis, code inspections, et cetera, is very, very important. And obviously it's always better to prevent a defect in the first place than to have it escape and then fix it, and knowledge — education — is one of the most important ways you can do that. So if that's something of interest to you, for example taking a deeper dive into some of the topics we've covered here as well as some new topics we haven't had time for, consider taking our one-day class on developing Safety Critical Firmware on September 23 in Detroit.

Slide 39: Further Reading
And then lastly, in closing, I want to leave you with a few references for further reading. The first item is an editorial from 2010 in the L.A. Times, and it talks about the fleeting nature of difficult-to-reproduce software problems. In particular, it talks about the importance of using assertions, or sanity checks, and how one day a team at JPL in Pasadena, California saw an assertion catch something that should have been impossible.
Ultimately, the triggered assertion led to uncovering a defect that would've been almost impossible to reproduce otherwise. The second resource is both an article and a video by Gerard Holzmann; he's a senior research scientist at NASA's Jet Propulsion Laboratory, and the article and the video describe how the software for the Curiosity rover was created. And lastly, the third resource is Better Embedded System Software, a blog by Phil Koopman that talks in depth about all sorts of aspects of developing firmware for critical systems. Very good reading; I highly recommend it. So that's it — thank you very much for joining us today. We look forward to seeing you next time, or maybe even two weeks from now at our one-day course in Detroit. I'm now going to turn the presentation back over to Jennifer.