Evidence of Scholarly Ability


This page presents evidence of my scholarly ability in computer science as required for the ICS Ph.D. Portfolio. This includes evidence of ability to identify, critically analyze, and research a problem, as well as written communication skills.[**]

1. Publications

1.1 Measured: Student Learning Through Monitoring Existing Buildings’ Energy Use And Occupant Comfort

Reference: Measured: Student Learning Through Monitoring Existing Buildings’ Energy Use And Occupant Comfort. Meguro, W., Paradis, C. To appear in ARCC 2019 Conference.

Access: Full Paper (To Appear)

Abstract: The objectives of this paper are to share the methods and examine the effectiveness of an extracurricular applied research laboratory’s ability to develop architecture students’ energy and comfort knowledge through use of hands on tools and computer simulation. While the components themselves are not novel, their combination is uncommon and contributes an example of a useful, replicable practice. The atypical method uses a research lab to combine extracurricular in-depth, hands-on environmental systems education; community engagement on “real world” buildings; paid student research positions; gradual acquisition of an environmental systems tool kit; and funding from consistent research grants. While the previous qualities exist in architectural education, studies show they are the exception and not the norm. The University of Hawaii at Manoa School of Architecture Environmental Research and Design Lab consistently goes beyond the professional architecture curriculum to deepen students’ knowledge in and affinity for designing and operating energy efficient, comfortable buildings. The students demonstrate initiative, skill mastery, and affinity for environmental systems and building technology subjects during their studies and upon graduation in their career selections.

My Contribution: I assisted with the paper outline, proposed the usage of Bloom’s Taxonomy to strengthen the validity of the results, suggested some of the challenges and positive outcomes, and did part of the literature review.

1.2 Towards Explaining Security Defects in Complex Autonomous Aerospace Systems

Reference: Towards Explaining Security Defects in Complex Autonomous Aerospace Systems. Carlos Paradis, Rick Kazman, and Misty D. Davies. AIAA Scitech 2019 Forum. January. https://doi.org/10.2514/6.2019-0770

Access: Full Paper

Abstract: Current-day autonomous aerospace systems are increasingly being designed with service-oriented architectures. These architectures make it easy to reconfigure off-the-shelf components and capabilities for new missions and to plug-and-play new capabilities as they are developed. However, such components tend to be assured only in isolation. Validation and verification of the overall system, including the interactions of these components, is likely to rely on a contract-based guarantee approach for the components and runtime verification. Specifying contracts in order to insure overall system performance and safety is a difficult problem. In this paper, we discuss the current gaps towards achieving cybersecurity assurance given these architectures. We also briefly discuss an approach to learning these contracts in order to assure security behavior for the Air Force Research Lab (AFRL) UxAS use case.

My Contribution: Under the supervision of the P.I., I created the approach to learning contracts for the UxAS use case, as shown on this peer-reviewed poster, which was also presented.

1.3 Indexing Text Related to Software Vulnerabilities in Noisy Communities Through Topic Modelling

Reference: C. Paradis, R. Kazman and P. Wang, “Indexing Text Related to Software Vulnerabilities in Noisy Communities Through Topic Modelling,” 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), Orlando, FL, 2018, pp. 763-768. doi: 10.1109/ICMLA.2018.00121

Access: Full Paper

Abstract: Despite efforts in the security community to quickly index and disseminate vulnerabilities as they are discovered and addressed, there are concerns about how to scale up the knowledge management of vulnerabilities given its dramatic growth rate. To address these concerns, recent research shifted towards more proactive approaches, in particular leveraging text mining methods to improve vulnerability identification and dissemination to security investigators. While providing a starting point for understanding vulnerability trends, recent methods are still reliant on curated identifiers, such as ‘CVE-*’, hence missing the majority of cybersecurity activity. We show that we can leverage overlapping textual themes in software vulnerabilities to identify related software vulnerability discussions without prior knowledge of identifiers. Our method obtained 86% accuracy in identifying related vulnerabilities with minimal pre-processing in a noisy community.

My Contribution: I identified that a portion of the data provided by CVE could be used as ground truth, and formulated a hypothesis that leverages this data to validate our ongoing pipeline. I also wrote all the code for topic modelling and the paper's visualizations, conducted the literature review, and did the writing.

1.4 Conway: Law or not?

Reference: Wolfgang Mauerer, Mitchell Joblin, Damian Andrew Tamburri, Carlos Paradis, Sven Apel, and Rick Kazman. Conway: Law or not? In 2018 40th International Conference On Software Engineering, Gothenburg, Sweden, May 2018.

Access: Poster

My Contribution: I wrote a portion of the data pipeline scripts required for the results presented in this work, and contributed to the primary author's tool, which is open source and available here with my name listed.

1.5 Probabilistic Models for One-Day Ahead Solar Irradiance Forecasting in Renewable Energy Applications

Reference: SILVA, Carlos V. A. ; LIM, Lipyeow ; STEVENS, Duane ; NAKAFUJI, Dora. Probabilistic Models for One-Day Ahead Solar Irradiance Forecasting in Renewable Energy Applications. Special Track on Machine Learning on Energy Applications. International Conference on Machine Learning and Applications 2015. 10.1109/ICMLA.2015.137.

Access: Full Paper

Abstract: Solar irradiance forecasting is an important problem in renewable energy management where any dips in solar energy generation must be made up for by reserves in order to ensure an uninterrupted energy supply. In this paper, we study several data mining methods for short term solar irradiance forecasting at a given location. In particular, we apply linear regression, probabilistic models, and naive Bayes classifier to forecast solar irradiance one day ahead, i.e., we forecast what tomorrow’s solar irradiance will be like at sundown today. We evaluate the forecasting performance of our adaptations of the three models using land-based weather data from several weather stations on the island of Oahu in Hawai’i.

My Contribution: I wrote the code for dataset collection, curation, and analysis, and did most of the writing, including literature review.

1.6 Manufacturing execution systems: A vision for managing software development

Reference: Martin Naedele, Hong-Mei Chen, Rick Kazman, Yuanfang Cai, Lu Xiao, Carlos V.A. Silva, Manufacturing execution systems: A vision for managing software development, Journal of Systems and Software, Volume 101, March 2015, Pages 59-68, ISSN 0164-1212, http://dx.doi.org/10.1016/j.jss.2014.11.015.

Access: Full Paper

Abstract: Software development suffers from a lack of predictability with respect to cost, time, and quality. Predictability is one of the major concerns addressed by modern manufacturing execution systems (MESs). A MES does not actually execute the manufacturing (e.g., controlling equipment and producing goods), but rather collects, analyzes, integrates, and presents the data generated in industrial production so that employees have better insights into processes and can react quickly, leading to predictable manufacturing processes. In this paper, we introduce the principles and functional areas of a MES. We then analyze the gaps between MES-vision-driven software development and current practices. These gaps include: (1) lack of a unified data collection infrastructure, (2) lack of integrated people data, (3) lack of common conceptual frameworks driving improvement loops from development data, and (4) lack of support for projection and simulation. Finally, we illustrate the feasibility of leveraging MES principles to manage software development, using a Modularity Debt Management Decision Support System prototype we developed. In this prototype we demonstrate that information integration in MES-vision-driven systems enables new types of analyses, not previously available, for software development decision support. We conclude with suggestions for moving current software development practices closer to the MES vision.

My Contribution: I assisted in collecting the data used to validate some of the empirical claims in the work.

1.7 Mining Retention Rules from Student Transcripts: A Case Study of the Information Systems programme at a Federal University

Reference: SILVA, C. V. A. ; SANTOS, M. S. ; Claro, D.B. ; Silva, Veronica ; SILVA, M. ; RIBEIRO, S. ; TELLES, A. R. ; LOPES, D. . Mining Retention Rules from Student Transcripts: A Case Study of the Information Systems programme at a Federal University. Anais do Simpósio Brasileiro de Informática na Educação, v. 1, p. 1, 2013. http://dx.doi.org/10.5753/cbie.sbie.2013.577.

Access: Full Paper

Abstract: Due to the increase in inflows, mainly because of REUNI procedures, and low completion rate also observed in several programmes on Brazilian universities, it is necessary to identify which factors may cause the students to remain in their programmes longer than expected or even leaving the university before their conclusion. In this work, we hypothesize that the combinations of classes that the students have to do in each semester is not appropriated and can cause student retention, leading to a huge loss to the university. Thus, we present a case study of analyzing retention rules in the Information Systems programme at the Federal University of Bahia. With the resulting rule set obtained from mined student transcripts, we discuss how changes can be made in a programme so as to decrease the student retention rates.

My Contribution: I parsed the PDF transcripts into datasets, proposed and implemented the methods used, and prepared the presentation of results and the threats to the validity of our conclusions.

1.8 An exploratory study to investigate the impact of conceptualization in god class detection

Reference: SANTOS, JOSÉ A. M. ; DE MENDONÇA, MANOEL G. ; SILVA, CARLOS V. A. . An exploratory study to investigate the impact of conceptualization in god class detection. In: the 17th International Conference, 2013, Porto de Galinhas. Proceedings of the 17th International Conference on Evaluation and Assessment in Software Engineering - EASE ’13. New York: ACM Press, 2013. p. 48. DOI: 10.1145/2460999.2461007.

Access: Full Paper

Abstract: Context: The concept of code smells is widespread in Software Engineering. However, in spite of the many discussions and claims about them, there are few empirical studies to support or contest these ideas. In particular, the study of the human perception of what is a code smell and how to deal with it has been mostly neglected. Objective: To build empirical support to understand the effect of god classes, one of the most known code smells. In particular, this paper focuses on how conceptualization affects identification of god classes, i.e., how different people perceive the god class concept. Method: A controlled experiment that extends and builds upon another empirical study about how humans detect god classes [19]. Our study: i) deepens and details some of the research questions of the previous study, ii) introduces a new research question and, iii) when possible, compares the results of both studies. Result: Our findings show that participants have different personal criteria and preferences in choosing drivers to identify god classes. The agreement between participants is not high, which is in accordance with previous studies. Conclusion: This study contributes to expand the empirical data about the human perception of code smells. It also presents a new way to evaluate effort and distraction in experiments through the use of automatic logging of participant actions.

My Contribution: I wrote the code to perform the statistical analysis of the events generated when users clicked different interfaces, which is described in the paper.

1.9 Teaching Software Engineering Fundamentals in an Introductory Computer Programming Course

Reference: ALMEIDA, E.S ; MACHADO, I. C. ; SILVA, C. V. A. ; GOMES, G. S. S. . Teaching Software Engineering Fundamentals in an Introductory Computer Programming Course. In: IV Fórum de Educação em Engenharia de Software (FEES), 2011, São Paulo. XXV Simposio Brasileiro de Engenharia de Software (SBES), 2011.

Access: Full Paper

Abstract: Programming is an important part of software development and well-educated professionals is critical for the industry needs. However, in general, the computer programming courses are just focused on the language, constructions, structures, and so on. In this paper, we present a new approach to teach an introductory computer programming course based on software engineering fundamentals. The approach has been applied since 2009 in our university and the results are promising.

My Contribution: I helped control the quasi-experiment during its execution over the semesters and helped assess participants' concerns. I also helped organize the data collection and exploratory analysis, and contributed ideas for the presentation of results.

2. Literature Review

2.1 Identifying Cybersecurity Software Vulnerabilities from a Defender and Attacker Standpoint: Literature Review

Reference: Identifying Cybersecurity Software Vulnerabilities from a Defender and Attacker Standpoint: Literature Review Carlos Paradis.

Access:

Abstract: The goal of this literature review is to discuss the state of the art in helping security and safety analysts prevent software vulnerabilities. The literature suggests two main approaches. The first assumes that the source of vulnerabilities is in the code of an application, and therefore the infrastructure that supports software development can be studied, and potentially augmented, to prevent software faults, in particular security bugs. The second assumes that, beyond code, software vulnerabilities are concepts in a programmer’s mind, in particular an attacker’s, that span multiple vulnerabilities independent of a particular software application and its infrastructure. Vulnerabilities can therefore be studied in online forums and social media, and their concepts leveraged to prevent software vulnerabilities even before code is written. Following the two main threads of current and past literature, this literature review discusses what work has been done in these two areas, and proposes research opportunities for future work.

3. Master Thesis - Plan A

3.1 Probabilistic models for one-day ahead solar irradiance Forecasting in renewable energy applications on oahu

Reference: Carlos Paradis. Probabilistic models for one-day ahead solar irradiance Forecasting in renewable energy applications on oahu. Master’s thesis, University of Hawaii at Manoa, School of Innovation, Design and Engineering, 2016.

Access:

Abstract: In order to produce energy, the Hawaiian Islands rely heavily on oil and oil products to fuel their power plants, leading to high electricity costs that help make renewable energy economically competitive, such as solar energy. Solar energy production, however, introduces a new dimension of uncertainty to meet energy load with supply due to climate conditions: We are not guaranteed to have sufficient solar irradiance available the next day to generate the necessary amount of energy for households and businesses. Forecasting 1-day ahead solar forecasting would then be helpful to know how much energy from other sources are necessary to be produced, in order to compensate the lack of solar energy for the following day.

To address the solar irradiance forecasting need, in this thesis we investigate probabilistic models for one-day ahead solar irradiance forecasting. Namely, we investigate how the use of past solar irradiance and other weather variables (e.g. relative humidity, pressure, temperature, etc.) using one or more sites can influence the accuracy of 1-day ahead solar forecasts. We also discuss how different parameters and limitations encountered throughout the usage of our probability models for solar forecasting influence the forecasts. To address the limitations, we present an entropy based probability model.

4. Technical Reports

4.1 A New Perspective on Predicting Maintenance Costs

Reference: Florian Uunk, Rick Kazman, Yuanfang Cai, Noah Black, Carlos Andrade, and Fetsje Bijma. A New Perspective on Predicting Maintenance Costs. Technical report, Drexel University, 2012.

Access: Full Paper

Abstract: In this paper, we present a new approach to correlating file metrics to maintenance effort. We examine the correlations between variations in file metrics and variations in the maintenance effort spent on these files over multiple releases. Because effort data is seldom accurately collected, and is never collected for open source projects, we have employed three novel, broadened and more holistic measures of file-level maintenance effort: the number of lines of code changed to resolve tasks (churn), the amount of discussion that tasks generated (discussions), and the number of atomic changes made to a file to resolve a task (actions). From the data extracted from multiple Apache projects, we found that a small subset of file metrics were significantly correlated to our effort measures, especially to code churn and actions. The best correlations vary from project to project, suggesting that maintenance effort measurements should be project-specific.

My Contribution: I executed part of the pipeline built by the primary author and created the regular expressions needed to extend the pipeline to other projects.

[**]Note that in older papers my name appears as Carlos V.A. Silva.