Here, the computed value is a similarity score for which smaller values represent closer matches. The x, y, w, and h variables correspond to the x and y locations of the left-hand and top borders of the bounding box, and the width and height of the bounding boxes, for the mock-up and implementation GCs respectively. The result is a list of GCs that should logically correspond to one another (corresponding GCs).
It is possible that there exist instances of missing or extraneous components between the mock-up and implementation. To identify these cases, our KNN algorithm employs a GC-Matching Threshold (MT). If the similarity score of the nearest neighbor match for a given input mock-up GC exceeds this threshold, it is not matched with any component, and will be reported as a missing GC violation. If there are unmatched GCs from the implementation, they are later reported as extraneous GC violations.
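The thresholded nearest-neighbor matching described above can be sketched as follows. This is a minimal illustration, not GVT's actual code: the `GC` class, the function names, and the sum-of-absolute-differences form of the similarity score are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GC:
    """Hypothetical GUI-component record: a name plus its bounding box."""
    name: str
    x: int
    y: int
    w: int
    h: int


def similarity(m: GC, r: GC) -> int:
    # Assumed form of the score: sum of absolute differences of the
    # bounding-box attributes. Smaller values mean closer matches.
    return abs(m.x - r.x) + abs(m.y - r.y) + abs(m.w - r.w) + abs(m.h - r.h)


def match_components(
    mockup: List[GC], implementation: List[GC], mt: float
) -> Tuple[List[Tuple[GC, GC]], List[GC], List[GC]]:
    """Return (corresponding pairs, missing GCs, extraneous GCs)."""
    matches, missing, used = [], [], set()
    for m in mockup:
        # Nearest neighbor of the mock-up GC among implementation GCs.
        best = min(implementation, key=lambda r: similarity(m, r), default=None)
        if best is None or similarity(m, best) > mt:
            missing.append(m)  # nearest neighbor is beyond the threshold
        else:
            matches.append((m, best))
            used.add(best)
    # Implementation GCs that were never matched are extraneous.
    extraneous = [r for r in implementation if r not in used]
    return matches, missing, extraneous
```

With a threshold of 135 (one eighth of a 1080-pixel-wide screen), a mock-up button whose nearest implementation neighbor scores 231 would be reported as missing, while an implementation GC that no mock-up GC selects would be reported as extraneous.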
There may also be cases where a logical GC in the implementation is represented as a small group of mock-up GCs. GVT handles these cases using the similarity function outlined above. For each mock-up GC, GVT checks whether the neighboring GCs in the mock-up are closer than the closest corresponding GC in the implementation. If so, they are merged, and the process repeats until a logical GUI-component is represented.
Stage 3: Design Violation Detection
In the Design Violation Detection stage of the GVT workflow, the approach uses a combination of computer vision techniques and heuristic checking to effectively detect and differentiate between orthogonal categories of DVs.
Perceptual Image Differencing. To determine which corresponding GCs have visual discrepancies, GVT uses a technique called Perceptual Image Differencing (PID) that operates upon the mock-up and implementation screenshots. PID utilizes a model of the human visual system to compare two images and detect visual differences, and has been used to successfully identify visual discrepancies in web applications in previous work. We use this algorithm in conjunction with the GC information derived in the previous steps of GVT to achieve accurate violation detection. For a full description of the algorithm, we refer readers to the PID project. The PID algorithm uses several adjustable parameters, including F, which corresponds to the visual field of view in degrees; L, which indicates the luminance or brightness of the image; and C, which adjusts sensitivity to color differences. The values used in our implementation are stipulated at the end of this section.
The output of the PID algorithm is a single difference image (Fig. 1-3) containing difference pixels, which are pixels considered to be perceptually different between the two images. After processing the difference image generated by PID, GVT extracts the implementation bounding box for each corresponding pair of GCs, and overlays the box on top of the generated difference image. It then calculates the number of difference pixels contained within the bounding box, where higher numbers of difference pixels indicate potential visual discrepancies. Thus, GVT collects all "suspicious" GC pairs with a percentage of difference pixels higher than a Difference Threshold (DT). This set of suspicious components is then passed to the Violation Manager (Fig. 1-3) so that specific instances of DVs can be detected.
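The bounding-box overlay step can be illustrated with a short sketch. It assumes the PID output has already been reduced to a boolean mask (True marks a difference pixel); the mask representation, the `(x, y, w, h)` box layout, and the function names are hypothetical choices for this example.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, w, h), assumed layout


def diff_percentage(diff_mask: List[List[bool]], box: Box) -> float:
    """Percentage of difference pixels inside the bounding box."""
    x, y, w, h = box
    rows = diff_mask[y:y + h]
    flagged = sum(sum(1 for px in row[x:x + w] if px) for row in rows)
    area = sum(len(row[x:x + w]) for row in rows)
    return 100.0 * flagged / area if area else 0.0


def is_suspicious(diff_mask: List[List[bool]], box: Box, dt: float = 20.0) -> bool:
    # A GC pair is "suspicious" when its share of difference pixels
    # exceeds the Difference Threshold (DT), expressed here in percent.
    return diff_percentage(diff_mask, box) > dt
```

Normalizing by the box area, rather than using a raw pixel count, keeps the threshold comparable across small and large components.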
Detecting Layout Violations. The first general category of DVs that GVT detects is Layout Violations. There are six specific layout DV categories that relate to two component properties: (i) screen location (i.e., <x,y> position) and (ii) size (i.e., <h,w> of the GC bounding box). GVT first checks for the three types of translation DVs using a heuristic that measures the distance from the top and left-hand edges of matched components. If the difference between the components in either the x or y dimension is greater than a Layout Threshold (LT), then these components are reported as a Layout DV. Using the LT prevents trivial location discrepancies within design tolerances from being reported as violations, and the LT can be set by a designer or developer using the tool. When detecting the three types of size DVs in the derived design violation taxonomy, GVT uses a heuristic that compares the width and height of the bounding boxes of corresponding components. If the width or height of the bounding boxes differ by more than the LT, then a layout violation is reported.
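A minimal sketch of the two layout heuristics follows. The `Box` type and the returned labels are hypothetical; the sketch only separates translation from size discrepancies per axis and does not reproduce the six named categories of the taxonomy.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Hypothetical bounding box of a GC, in pixels."""
    x: int
    y: int
    w: int
    h: int


def layout_violations(mockup: Box, impl: Box, lt: int = 5) -> List[str]:
    """Return coarse layout discrepancies that exceed the Layout Threshold."""
    found = []
    # Translation check: offsets of the left-hand (x) and top (y) edges.
    if abs(mockup.x - impl.x) > lt:
        found.append("translation-x")
    if abs(mockup.y - impl.y) > lt:
        found.append("translation-y")
    # Size check: width and height of the bounding boxes.
    if abs(mockup.w - impl.w) > lt:
        found.append("size-width")
    if abs(mockup.h - impl.h) > lt:
        found.append("size-height")
    return found
```

For example, a component that is shifted down by 20 pixels and is 10 pixels taller than designed would yield `["translation-y", "size-height"]`, while a 3-pixel horizontal shift stays within the tolerance.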
Detecting Text Violations. The next general type of DV that GVT detects is Text Violations, of which there are three specific types: (i) Font Color, (ii) Font Style, and (iii) Incorrect Text Content. These detection strategies are only applied to pairs of text-based components, as determined by uiautomator information. To detect font color violations, GVT extracts cropped images for each pair of suspicious text components by cropping the mock-up and implementation screenshots according to the components' respective bounding boxes. Next, Color Quantization (CQ) is applied to accumulate instances of all unique RGB values expressed in the component-specific images. This quantization information is then used to construct a Color Histogram (CH) (Fig. 1-3). GVT computes the normalized Euclidean distance between the extracted Color Histograms for the corresponding GC pairs, and if the histograms do not match within a Color Threshold (CT), then a Font-Color DV is reported and the top-3 colors (i.e., centroids) from each CH are recorded in the GVT report. Conversely, if the colors do match, then the PID discrepancy identified earlier is attributed to a change in font style (provided there are no existing layout DVs), and thus a Font-Style Violation is reported. Finally, to detect incorrect text content, GVT takes the textual information, preprocessed to remove whitespace and normalize letter case, and performs a string comparison. If the strings do not match, then an Incorrect Text Content DV is reported.
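The text-violation checks might look roughly like the sketch below. Several details are assumptions made for illustration: pixels are given as lists of RGB tuples, histograms are normalized to sum to 1, and the distance is converted to a match score by dividing by its maximum possible value (the square root of 2) so that it can be compared against a CT of 0.85. The function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]


def color_histogram(pixels: List[RGB]) -> Dict[RGB, float]:
    """Count each unique RGB value and normalize the counts to sum to 1."""
    counts = Counter(pixels)
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}


def histogram_similarity(h1: Dict[RGB, float], h2: Dict[RGB, float]) -> float:
    """1.0 for identical histograms, 0.0 for histograms with disjoint colors."""
    colors = set(h1) | set(h2)
    dist = math.sqrt(sum((h1.get(c, 0.0) - h2.get(c, 0.0)) ** 2 for c in colors))
    return 1.0 - dist / math.sqrt(2)  # assumed normalization


def classify_font_violation(mock_px: List[RGB], impl_px: List[RGB], ct: float = 0.85) -> str:
    # Applied only to suspicious text pairs with no existing layout DV.
    match = histogram_similarity(color_histogram(mock_px), color_histogram(impl_px))
    return "font-style" if match >= ct else "font-color"


def top_colors(pixels: List[RGB], k: int = 3) -> List[RGB]:
    """The k most frequent colors, e.g. for inclusion in a report."""
    return [color for color, _ in Counter(pixels).most_common(k)]


def text_matches(mock_text: str, impl_text: str) -> bool:
    # Remove all whitespace and normalize letter case before comparing.
    def normalize(s: str) -> str:
        return "".join(s.split()).lower()
    return normalize(mock_text) == normalize(impl_text)
```

Under these assumptions, `text_matches("Sign  In", "sign in")` is true, so only genuine content changes would be reported as Incorrect Text Content DVs.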
Detecting Resource Violations. GVT is able to detect the following resource DVs: (i) missing component, (ii) extraneous component, (iii) image color, (iv) incorrect images, and (v) component shape. Detecting and distinguishing between Incorrect Image DVs and Image Color DVs requires an analysis that combines two different computer vision techniques. To perform this analysis, cropped images are extracted from the mock-up and implementation screenshots according to the corresponding GCs' respective bounding boxes. The goal of this analysis is to determine when the content of image-based GCs differs, as opposed to only the colors of the GCs differing. To accomplish this, GVT applies PID to extracted GC images converted to a binary color space (B-PID) in order to detect differences in content, and uses CQ and CH analysis to determine differences in color (Sec. 4.4.3). To perform the B-PID procedure, cropped GC images are converted to a binary color space by extracting pixel intensities and then applying a binary transformation to the intensity values (e.g., converting the images to intensity-independent black and white). PID is then run on the color-neutral versions of these images. If the images differ by more than an Image Difference Threshold (IDT), then an Incorrect Image DV (which encompasses the Component Shape DV) is reported. If the component passes the binary PID check, then GVT uses the same CQ and CH processing technique described above to detect image color DVs. Missing and extraneous components are detected as described earlier.
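The two-step decision for image-based GCs can be sketched as below. This is a simplification under stated assumptions: a plain pixel-wise comparison of the binarized images stands in for PID, intensity is computed with the common Rec. 601 luma weights, the binarization threshold of 128 is arbitrary, and both crops are assumed to have the same dimensions. The function names are hypothetical.

```python
from typing import List, Optional, Tuple

RGB = Tuple[int, int, int]
Image = List[List[RGB]]  # rows of RGB pixels


def binarize(img: Image, threshold: float = 128.0) -> List[List[bool]]:
    """Map each pixel to black/white based on its intensity (assumed luma)."""
    return [
        [(0.299 * r + 0.587 * g + 0.114 * b) >= threshold for (r, g, b) in row]
        for row in img
    ]


def binary_diff_percentage(a: Image, b: Image) -> float:
    """Stand-in for B-PID: share of pixels whose binarized values differ."""
    bin_a, bin_b = binarize(a), binarize(b)
    total = sum(len(row) for row in bin_a)
    differing = sum(
        pa != pb for ra, rb in zip(bin_a, bin_b) for pa, pb in zip(ra, rb)
    )
    return 100.0 * differing / total if total else 0.0


def classify_image_violation(
    a: Image, b: Image, colors_match: bool, idt: float = 20.0
) -> Optional[str]:
    # Step 1: content check on the color-neutral images.
    if binary_diff_percentage(a, b) > idt:
        return "incorrect-image"
    # Step 2: content agrees, so fall back to the color-histogram result.
    if not colors_match:
        return "image-color"
    return None
```

Running the content check first is what separates the two cases: a red icon swapped for a blue one of the same shape passes the binary comparison and is reported only as a color discrepancy.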
Generating Violation Reports. In order to provide developers and designers with effective information regarding the detected DVs, GVT generates an HTML report that, for each detected violation, contains the following: (i) a natural language description of the design violation(s), (ii) an annotated screenshot of the app implementation, with the affected GUI-component highlighted, (iii) cropped screenshots of the affected GCs from both the design and implementation screenshots, (iv) links to affected lines of application source code, (v) color information extracted from the CH for GCs identified to have color mismatches, and (vi) the difference image generated by PID. The source code links are generated by matching the ids extracted from the uiautomator information back to their declarations in the layout XML files in the source code (e.g., those located in the /res/ directory of an app's source code).
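One simple way to link a component id back to its declaration is a line-by-line scan of the layout files, sketched below. This is an illustrative assumption rather than GVT's actual mechanism: it expects uiautomator ids of the form `package:id/name`, looks only for `android:id="@+id/name"` declarations, and takes the layout files as an in-memory mapping from path to contents. The function name is hypothetical.

```python
import re
from typing import Dict, List, Tuple


def find_id_declarations(
    resource_id: str, layouts: Dict[str, str]
) -> List[Tuple[str, int]]:
    """Return (file path, 1-based line number) pairs declaring the id."""
    # "com.example:id/login_button" -> "login_button"
    short_name = resource_id.split(":id/")[-1]
    pattern = re.compile(
        r'android:id\s*=\s*"@\+id/' + re.escape(short_name) + r'"'
    )
    hits = []
    for path, text in layouts.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path, lineno))
    return hits
```

The closing quote in the pattern keeps an id such as `login` from matching a declaration of `login_button`.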
GVT Parameters
Using the acceptance tests and feedback from our collaborators at Huawei, we tuned the various thresholds and parameters of the tool for best performance. The PID algorithm settings were tuned for sensitivity, to capture subtle visual inconsistencies that are later filtered through additional CV techniques: F was set to 45°, L was set to 100 cd/m², and C was set to 1. The GC-Matching Threshold (MT) was set to 1/8th the screen width of a target device; the DT for determining suspicious GCs was set to 20%; the LT was set to 5 pixels (based on designer preference); the CT, which determines the degree to which colors must match for color-based DVs, was set to 85%; and finally, the IDT was set to 20%. GVT allows a user to change these settings if desired. Additionally, users can define areas of dynamic content (e.g., content loaded from network activity) that should be ignored by the GVT analysis.
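For reference, the settings above can be gathered into a single configuration object. The class and field names below are hypothetical and do not reflect GVT's actual configuration format; only the default values come from the text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, w, h), assumed layout


@dataclass
class GVTParams:
    """Hypothetical container for the thresholds and PID settings."""
    pid_field_of_view_deg: float = 45.0           # F
    pid_luminance_cd_m2: float = 100.0            # L
    pid_color_factor: float = 1.0                 # C
    difference_threshold_pct: float = 20.0        # DT
    layout_threshold_px: int = 5                  # LT
    color_threshold_pct: float = 85.0             # CT
    image_difference_threshold_pct: float = 20.0  # IDT
    # User-defined regions of dynamic content to exclude from analysis.
    ignored_regions: List[Box] = field(default_factory=list)

    def matching_threshold(self, screen_width_px: int) -> float:
        """MT: one eighth of the target device's screen width."""
        return screen_width_px / 8
```

Because MT depends on the target device, it is derived from the screen width rather than stored as a constant; for a 1080-pixel-wide screen it evaluates to 135.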
Tools used for GVT Implementation
We provide the tools we used in our implementation of the GUI-Collection, Design Violation Detection, and Report Generation Stages of GVT.
Tools used to implement GUI-Collection:
Tools used to implement Design Violation Detection:
Tools used to implement Report Generation:
Accessing GVT & Documentation
Because we developed GVT in collaboration with Huawei, we must control access to it. We are able to share both binaries and source code of the tool with researchers and open source developers. Please click the button to fill out the form below and we will share the corresponding materials with you.