This article is a spin-off from the paper “Do We Need Improved Code Quality Metrics?” that I co-authored with Prof. Diomidis Spinellis.
Software metrics have always been on a roller-coaster ride. On the one hand, researchers and practitioners have adopted metrics not only to reveal quality characteristics of their programs but also to combine existing metrics into new ones and use them to study more complex phenomena. On the other hand, metrics have drawn wide criticism, and the majority of that criticism concerns two aspects: completeness and soundness.
The first aspect, completeness, concerns the extent to which a metric’s definition provides the implementation details needed to compute it, independent of the programming language. For instance, implementation details of the metrics in the C&K suite are missing, leaving them open to interpretation. Two examples of such deficiencies are the lack of concrete details for implementing the lack of cohesion in methods (LCOM) metric and the incomplete definition of the coupling between objects (CBO) metric. In particular, the definition of CBO does not clarify whether both incoming and outgoing dependencies, or only outgoing dependencies, should be used in the calculation. …
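To make the ambiguity concrete, here is a minimal sketch (in Python, for illustration only) of one possible reading of the two metrics: LCOM computed as the number of method pairs sharing no field minus the pairs sharing at least one, and CBO counted either over outgoing dependencies alone or over both directions. The data structures and the choice of LCOM variant are my assumptions, not definitions taken from the C&K paper.

```python
from itertools import combinations

def lcom1(method_field_usage):
    """One interpretation of LCOM (often called LCOM1): the number of method
    pairs that share no instance field minus the number of pairs that share
    at least one, floored at zero. Other interpretations differ, which is
    exactly the problem with the under-specified definition."""
    methods = list(method_field_usage.values())
    non_sharing = sharing = 0
    for fields_a, fields_b in combinations(methods, 2):
        if fields_a & fields_b:
            sharing += 1
        else:
            non_sharing += 1
    return max(non_sharing - sharing, 0)

def cbo(outgoing, incoming=None, count_incoming=False):
    """CBO counted over outgoing dependencies only, or over both directions;
    the original definition leaves this choice open."""
    coupled = set(outgoing)
    if count_incoming and incoming:
        coupled |= set(incoming)
    return len(coupled)

# Illustrative data: which fields each method of a class touches, and which
# classes it depends on / is depended on by (all names are made up).
usage = {"open": {"path", "handle"}, "close": {"handle"}, "name": {"path"}}
print(lcom1(usage))                                          # 0: cohesive under LCOM1
print(cbo({"Logger", "Config"}))                             # 2: outgoing only
print(cbo({"Logger", "Config"}, {"Parser"}, count_incoming=True))  # 3: both directions
```

Depending on which interpretation a tool implements, the same class can receive noticeably different scores, which is why such definitional gaps matter in practice.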
Designite’s audience often asks how Designite differs from other tools, especially SonarQube and NDepend. My (admittedly shallow) answer is that other tools lack a focus on design- and architecture-granularity issues, whereas for Designite such issues are first-class citizens.
To tackle the comparison properly, I took a large open-source project, NHibernate, analyzed it with Designite, NDepend, and SonarQube, and compared different aspects of the resulting code quality analyses.
All three tools present an analysis summary, though the information they show differs. …
A Continuous Integration (CI) pipeline, such as GitHub Actions, automates tasks such as compiling and testing software. You may also integrate code quality analysis tools (such as SonarQube, Codacy, and CodeBeat) into your CI pipeline. However, depending on the code quality analysis tool you employ, your pipeline may still lack some or all of the much-needed capabilities described below.
If you are a fan of the build and test badges posted on GitHub or any other hosting platform, keep reading; I am going to show how you can add a code quality badge to your repository.
Software engineers use badges in their open-source repositories to show the state of the build, the tests, and the test coverage at a glance. Badges help not only the developers working on a repository but also its users and potential contributors.
Let me introduce you to QScored, an open platform for code quality and ranking. The platform not only lets you explore various code quality aspects of thousands of analyzed open-source repositories but also allows you to upload your own code quality analysis report, which assigns a relative code quality score and rank to your project. …
Today, source code repositories abound on hosting platforms such as GitHub (128 million repositories as of February 2020). However, detailed code quality information is not available for the hosted repositories unless we run an existing code quality analysis tool ourselves on the repository of interest. Moreover, even if one analyzes a set of repositories, a relative scale of code quality is non-trivial to establish. …
IntelliJ IDEA is one of the most commonly used IDEs for Java. Both the community and professional editions are actively used by programmers and researchers worldwide. Apart from a myriad of features for developing, testing, and debugging code, the IDE provides extensive support for refactoring: you specify which section of code you would like to refactor and which refactoring technique to apply, and the IDE takes care of the rest.
But how do you identify what to refactor? An experienced programmer can often smell which sections of the code need refactoring, yet many quality issues remain unnoticed even by experienced programmers. …
Recently, I was reading “Coders: The Making of a New Tribe and the Remaking of the World” by Clive Thompson to write its review for Computing Reviews; you may find the review online (pay-walled). In the book, he takes the reader on a fascinating journey through the evolution of computing systems and software programmers. In one of the chapters, he tells the story of early programmers in the 50s and 60s. Mary Allen Wilkes was one such not-so-well-known legend, who studied philosophy and wanted to practice law but somehow landed at MIT as a software programmer to program the IBM 706. As one would expect, the life of a programmer was quite different then. She had to print her program on punched cards and wait for the computer to produce results. Often, as happens to all of us, the program would not produce the expected results, and she had to go back, try to find the problem, reprint the cards, and repeat the process. Not to forget that in those days one computer was often shared among many programmers. …
Code smells occur at all granularities, and we may categorize them by their scope and impact. Smells arising within a local scope, typically within a method, are referred to as implementation smells (such as empty catch block or magic number). Smells that involve properties of a class, and whose scope of impact comprises a set of classes, are referred to as design smells (such as god class and multifaceted abstraction). …
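To illustrate (in Python rather than C#, purely for brevity), the snippet below shows the two implementation smells named above, an empty catch block and a magic number, next to a cleaner equivalent; the function and constant names are made up for the example.

```python
# Implementation smells are visible within a single method.

def read_timeout_smelly(config):
    try:
        return int(config["timeout"])
    except KeyError:
        pass          # empty catch block: the failure is silently swallowed
    return 30          # magic number: what does 30 mean here?

DEFAULT_TIMEOUT_SECONDS = 30   # a named constant removes the magic number

def read_timeout(config):
    try:
        return int(config["timeout"])
    except KeyError:
        # handle (or at least report) the missing key explicitly
        print("timeout not configured; falling back to the default")
        return DEFAULT_TIMEOUT_SECONDS
```

Design smells such as god class, by contrast, only become apparent when you look at a class in the context of the other classes it interacts with.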
Code smells indicate the presence of quality issues in source code. An excessive number of smells makes a software system hard to evolve and maintain. In this article, we apply deep learning models based on CNNs and RNNs to detect code smells without extensive feature engineering, simply by feeding in the source code in tokenized form.
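As a rough sketch of the idea, and not the exact architecture or hyper-parameters from the paper, a 1D convolutional classifier over integer token sequences can be written in a few lines of Keras; the vocabulary size, sequence length, and layer sizes below are placeholder assumptions.

```python
from tensorflow.keras import layers, models

VOCAB_SIZE = 5000   # assumed number of distinct source-code tokens
MAX_LEN = 500       # assumed number of tokens kept per method/class fragment

def build_cnn_smell_detector():
    """Binary classifier: does this tokenized code fragment contain the smell?"""
    model = models.Sequential([
        layers.Embedding(VOCAB_SIZE, 32),          # integer token ids -> dense vectors
        layers.Conv1D(64, kernel_size=5, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(64, kernel_size=5, activation="relu"),
        layers.GlobalMaxPooling1D(),
        layers.Dense(32, activation="relu"),
        layers.Dense(1, activation="sigmoid"),     # probability that the smell is present
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

model = build_cnn_smell_detector()
model.build(input_shape=(None, MAX_LEN))           # each sample: MAX_LEN integer token ids
model.summary()
```

The same tokenized fragments can be fed to a recurrent model instead; the point is that the network learns its own features from raw tokens rather than relying on hand-crafted metrics.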
This article is derived from our paper “On the Feasibility of Transfer-learning Code Smells using Deep Learning” by Tushar Sharma, Vasiliki Efstathiou, Panos Louridas, and Diomidis Spinellis.
The following figure provides an overview of the setup. We download 1,072 C# repositories from GitHub and use Designite to analyze the C# code. We use CodeSplit to extract each method and class definition from the C# programs into a separate file. The learning data generator then uses the detected smells to split the code fragments into positive and negative samples for each smell: positive samples contain the smell, while negative samples are free from it. Tokenizer takes a method or class definition and generates an integer token for each token in the source code. We apply a pre-processing step, specifically duplicate removal, to the output of Tokenizer. …
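The actual pipeline relies on Designite, CodeSplit, and the Tokenizer tool; the toy Python snippet below only mimics the last two steps, integer tokenization and duplicate removal, using a naive regex-based tokenizer, so the tokenization rules here are an assumption made purely for illustration.

```python
import re

def to_int_tokens(code, vocab):
    """Naive stand-in for the Tokenizer step: split a code fragment into
    lexical tokens and map each distinct token to an integer id (ids start
    at 1, leaving 0 free for padding)."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", code)
    return [vocab.setdefault(tok, len(vocab) + 1) for tok in tokens]

def deduplicate(samples):
    """Pre-processing step: drop fragments whose token sequences are identical."""
    seen, unique = set(), []
    for seq in samples:
        key = tuple(seq)
        if key not in seen:
            seen.add(key)
            unique.append(seq)
    return unique

vocab = {}
fragments = ["int x = 42;", "int y = 42;", "int x = 42;"]   # toy C#-like snippets
tokenized = [to_int_tokens(f, vocab) for f in fragments]
print(deduplicate(tokenized))   # the third fragment is dropped as a duplicate
```

Duplicate removal matters because identical fragments appearing in both training and test data would inflate the reported performance of the models.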
Current software engineering research literature, developers’ blogs, and developer conferences talk a great deal about technical debt. It is worth asking: why is technical debt an important concern for a software development team? Why should we care about it? What if a project’s technical debt keeps increasing while the project is functional and the clients of the software are not complaining; should we still pay attention to it? In this text we try to understand technical debt and reason about why it needs to be addressed.
Let us start from the beginning, when the term was first introduced. Ward Cunningham introduced the term ‘technical debt’ to the world in a 1992 report, describing it as the debt that accrues when we knowingly or unknowingly make wrong or non-optimal technical decisions. Such a decision impacts not only the source code entity it concerns but also subsequent technical decisions, leading to an ever-higher pile of debt. If the accrued technical debt is not paid back for a long time, the software becomes increasingly harder to change, and in the extreme case the product becomes technically bankrupt, i.e., it is no longer feasible to introduce a change reliably in the prescribed time. …