Chapter 6: Generation and Reverse Engineering
The term generation refers to the act of automatically producing other artifacts from your models.
For example, given a class diagram, you may want to generate code for the classes in the diagram, or
generate a document that textually describes the classes, or generate a report that depicts certain
facts about the model, etc.
UMLStudio provides a scripting language for writing scripts that perform such
generations. This scripting language is called PragScript and is described in detail in
Chapter 7. This approach has three important benefits:
- It ensures that you have a way of performing generations that are not directly supported by the tool.
- It enables you to create scripts that perform generation tasks specific to your organization.
- It allows us to create and issue new standard scripts (or new versions of existing standard
scripts) without having to release a new version of the tool.
A set of Standard Scripts is provided with the product to support popular code and document
generation formats.
The following diagram illustrates the generation process. PragScript provides a set of functions
that can access the information in a project (e.g., the name of a class, its attributes, its methods).
The PragScript Interpreter interprets the script, which in turn accesses the information in
the project, to produce the output of generation.
Scripts are run by the first six commands in the Tools Menu.
These commands display dialog boxes that allow you to select the script you wish to run, and to set a number of
related options.
The output files created by a generation process appear inside the Generated folder in the explorer pane.
You can view a generated file in two ways:
- If you select the file by clicking on its icon, its contents will be displayed in the drawing
pane. This is an unformatted, read-only view of the file.
- If you double-click the file icon, its contents will be displayed by automatically launching a
separate application. This is a shortcut to the
Tools:Open File Separately command.
Reverse engineering is the opposite of generation: it analyses and parses
program code to produce models that convey the same information visually. For example, reverse engineering
a set of C++ classes will result in a class diagram that depicts the classes and their relationships.
The following diagram illustrates the reverse engineering process.
As of version 7.0, you can either use a script to perform reverse engineering, or use one of the
hard-coded reverse engineering modules. The latter option supports these languages:
C++, C#, Java, CORBA IDL, PHP, and Forte. The former option supports Ada 95 and Maxim.
Use of scripts is more flexible, as it allows you to customize the behavior of reverse engineering to
suit your needs. It also enables you to implement your own reverse engineering script for a language
not currently supported by the tool.
See Chapter 7 for a description of PragScript functions that you can employ
to write reverse engineering scripts.
To reverse engineer program code, use the Tools:Reverse Engineer
command. This command displays a dialog for specifying the files you want to be reverse-engineered and
a set of related options.
Some points worth noting when you reverse engineer code:
- Decide beforehand whether you want to see the class attributes and methods within the resulting
model. If so, then tick the Show All Attribute/Method Annotations option in the dialog. Of course, if
you don’t do this, you can always control the annotation of each class via its property sheet
afterwards. However, because the resulting diagram will be more compact, switching annotation on
for a class will necessitate a fair amount of manual rearrangement around the class to make room for its
bigger size.
- Decide beforehand whether you want normal links or step links and set this via the
Looks Menu. For large and complex diagrams, normal links
produce a less cluttered result. However, you can also switch the link looks afterwards.
- Reverse engineering will always capture inheritance and realization (as in Java) relationships.
Other relationships (composition and aggregation) will only be displayed if the
Show Relationships Graphically option is ticked. Each of these relationships is represented by the first
link tool in the template that has the corresponding semantics.
- You don’t need an empty model for reverse engineering. You can also use an existing model
(provided it is of type default or class diagram). Reverse engineering into an existing model will
not disturb any of the existing drawables in the model. The new drawables will be simply added
to the model.
- If you reverse engineer some code to produce a project, then modify that code, and reverse engineer
it again, the changes will be reflected in the project. In other words, reverse engineering the same
class will not result in a duplicate class, but will update the class (assuming that the class code has been
modified; otherwise, it will have no net effect).
- If you are reverse engineering into an empty model, then it is best to set the page matrix size
of the model to one page (as per default setting). UMLStudio automatically adjusts
the page matrix dimensions to provide sufficient room for the resulting model.
- Reverse engineering a large number of classes into the same model may result in a cluttered diagram
that can be hard to make sense of. In such cases, try to logically divide the files into a number of
categories and reverse engineer each category into a separate model.
- Where a class A has a local class B inside it (as in C++), a submodel will be automatically created
for A and B will appear in that submodel. Also, when reverse engineering Java code, the package
specification of each file determines the model hierarchy for the interfaces/classes within it.
A set of predefined scripts is provided with UMLStudio for code and document generation. These scripts
are for:
- C++ Code Generation
- C# Code Generation
- Java Code Generation
- PHP Code Generation
- IDL Code Generation
- Ada 95 Code Generation
- HTML Document Generation
- RTF Document Generation
As of version 4.1, code generation supports the preservation of code manually added by the user.
For example, if you generate code and then make manual changes to it
(e.g., insert statements inside a function), re-generating the code will not destroy your
modifications. Preservation is controlled by a check box in the code generation dialog.
You will notice that in the generated code, special comment blocks appear for this purpose.
For example, inside a generated Java function, you will see the following:
// PRESERVE:BEGIN
// Insert your code here...
// PRESERVE:END
Code that you place within such comment blocks is preserved. UMLStudio generates these
comment blocks automatically, at the appropriate places. You can also manually type them where you need them.
To preserve your code, UMLStudio uses a merge process. For example, suppose that you
have generated a file named Foo.java, and then made manual changes to it.
If you regenerate the file, UMLStudio will:
- Make a copy of the file and name it _old_Foo.java.
- Generate the new file and name it _new_Foo.java.
- Merge _old_Foo.java and _new_Foo.java
to produce Foo.java.
The two intermediate files (_old_Foo.java and _new_Foo.java),
although no longer needed, are not deleted so that you can view them if you wish. For example, if you are
not happy with the outcome of the merge, you can recover the old file.
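The merge step can be pictured with a short Python sketch. This is not UMLStudio's actual algorithm, which is not described here. It assumes that preserved blocks are matched purely by their order of appearance in the old and new files, and the function names extract_blocks and merge are hypothetical.

```python
BEGIN = "// PRESERVE:BEGIN"
END = "// PRESERVE:END"


def extract_blocks(text):
    """Return the bodies of all PRESERVE blocks, in order of appearance."""
    blocks, current = [], None
    for line in text.splitlines():
        marker = line.strip()
        if marker == BEGIN:
            current = []
        elif marker == END and current is not None:
            blocks.append(current)
            current = None
        elif current is not None:
            current.append(line)
    return blocks


def merge(old_text, new_text):
    """Copy the preserved bodies of old_text into the blocks of new_text.

    Assumption: the n-th block of the old file belongs to the n-th block
    of the newly generated file.
    """
    preserved = iter(extract_blocks(old_text))
    merged = []
    inside = False          # currently between BEGIN and END in new_text
    keep_generated = False  # True when the old file has no matching block
    for line in new_text.splitlines():
        marker = line.strip()
        if marker == BEGIN:
            inside = True
            merged.append(line)
            body = next(preserved, None)
            keep_generated = body is None
            if body is not None:
                merged.extend(body)
        elif marker == END:
            inside = False
            merged.append(line)
        elif not inside or keep_generated:
            merged.append(line)
    return "\n".join(merged) + "\n"
```

In terms of the example above, merge would be called with the contents of _old_Foo.java and _new_Foo.java, and its result written to Foo.java. Matching blocks by position is the simplest choice for a sketch; it breaks down if methods are reordered between generations.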
Only those links and places whose Code Gen flag has been enabled are considered during code generation.
All other drawables are ignored.
The code generated for a place or link depends on its semantics, as
specified by the notation template. Not all semantics result in code generation.
For C++ code generation, if the selected scope is All Models, then all the models are considered (in
the order they were created in the project). Otherwise, only the current model is considered. For each
such model, code is generated for each place’s master (in the order they were created in the model).
The generated files are named according to the path specified in the generation dialog and the name
of each master. The code for each master is generated in a separate pair of files (e.g.,
Foo.h and Foo.cpp).
If the master has a file name specified in its property sheet, then this file name will be
used. Otherwise, the file name will default to the master name.
As each master is generated, its file names appear in the log window, e.g.,
Generating: Foo.h + Foo.cpp.
For each master, the generated header (.h) file will contain the following:
- Macro definitions to ensure that the header will not be multiply included in other files. E.g.:
#ifndef _Foo_H_
#define _Foo_H_
...
#endif
- For each master inherited/contained by this master, a #include appears for it. E.g.:
#include "Woo.h"
- The class comment (if any) appears next.
- The class declaration then appears, consisting of the following:
  - Class name, followed by the base class names. E.g.:
    class Foo : public Woo {
  - Implicit class member declarations (these are not implicitly generated if explicitly defined by the
    user in the class method list):
    - Default constructor (with no arguments)
    - Copy constructor
    - Destructor
    - Memberwise assignment operator
  - Exception declarations. For each exception, an enclosed class containing the following will
    appear:
    - Exception comment (if any)
    - Exception name
    - Exception members (type and name of each member)
  - Method declarations, including implicit Get/Set methods for attributes that have their Get/Set
    flag set. For each method, the following will appear:
    - Method access permissions, if different from current permissions (e.g., public)
    - Method comment (if any)
    - Method mode (e.g., virtual)
    - Method return type (defaulting to int if none specified)
    - Method name
    - Method parameters: type, name, and default value (if specified) for each parameter. An in
      parameter is defined as const. A type-less parameter is defined as int. An out parameter is
      specified as reference (&).
    - The const keyword, if the method is constant
    - Method throw list (if specified)
  - Attribute declarations. For each attribute, the following will appear:
    - Attribute access permissions, if different from current permissions (e.g., public)
    - Attribute comment (if any)
    - Attribute mode (e.g., static)
    - Attribute type (defaulting to int if none specified)
    - Attribute name
    - Attribute cardinality: if greater than 1, then this is generated as an array (e.g., if attribute
      x has type double and cardinality 10, the resulting code will be double x[10];)
  - Containment declarations. For each containment, the following will appear:
    - Same as attribute, but the type is the contained class
  - Relationship declarations. For each relationship, the following will appear:
    - Same as attribute, but the type is a pointer to the related class
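A few of the header rules above can be traced with a small Python sketch. It is illustrative only and is not the Standard Script itself: it covers just the include guard, the #include lines, the class line, and attribute declarations, it assumes every base class is inherited publicly, and it omits access labels, methods, and exceptions. The names attribute_decl and header_skeleton are hypothetical.

```python
def attribute_decl(name, type_name=None, cardinality=1):
    """Render one attribute; the type defaults to int when none is given."""
    type_name = type_name or "int"
    # A cardinality greater than 1 is rendered as an array.
    suffix = f"[{cardinality}]" if cardinality > 1 else ""
    return f"    {type_name} {name}{suffix};"


def header_skeleton(class_name, bases=(), attributes=()):
    """Render a reduced header: guard, includes, class line, attributes.

    Each attribute is a tuple (name, type_name, cardinality); the last two
    entries are optional.
    """
    guard = f"_{class_name}_H_"
    lines = [f"#ifndef {guard}", f"#define {guard}"]
    lines += [f'#include "{base}.h"' for base in bases]
    # Assumption: all bases are public, as in the "class Foo : public Woo" example.
    inherits = ", ".join(f"public {base}" for base in bases)
    lines.append(f"class {class_name}" + (f" : {inherits}" if inherits else "") + " {")
    lines += [attribute_decl(*attr) for attr in attributes]
    lines += ["};", "#endif"]
    return "\n".join(lines)


print(header_skeleton("Foo", bases=["Woo"], attributes=[("x", "double", 10), ("n",)]))
```

For the sample call, the attribute lines come out as double x[10]; and int n;, matching the cardinality and default-type rules described above.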
For each master, the generated implementation (.cpp) file will contain the following:
- Textual inclusion of the associated header file. E.g.:
#include "Foo.h"
- The class comment (if any) appears next.
- For each of the methods generated in the header file, a skeleton body of the method appears next
(in the same order as in the header file).
- For each of the static attributes, an initialization of the attribute appears next. E.g.:
const int Foo::attr = 10;
C# Code Generation uses rules similar to Java Code Generation.
For Java code generation, if the selected scope is All Models, then all the models are considered (in
the order they were created in the project). Otherwise, only the current model is considered. For each
such model, code is generated for each place’s master (in the order they were created in the model).
The generated files are named according to the path specified in the generation dialog and the name
of each master. The code for each master is generated in a separate file. If the master has a file
name specified in its property sheet, then this file name is used. Otherwise, the file name
defaults to the master name.
As each master is generated, its file name appears in the log window, e.g.,
Generating: Foo.java. If a
master has the Interface option set in its property sheet, then it will be generated as an interface.
Otherwise, it will be generated as a class.
For each master, the generated file will contain the following:
- Package declaration. This will be applicable if the master appears in a submodel. The package
path will be the same as the submodel path within the project.
- The interface/class comment (if any) appears next.
- The interface/class declaration then appears, consisting of the following:
  - Interface/class name, followed by the extends/implements names. E.g.:
    class Foo extends Woo implements Koo {
  - Exception declarations. For each exception, an exception declaration containing the following will
    appear:
    - Exception comment (if any)
    - Exception name
    - Exception members (type and name of each member)
  - Implicit interface/class member declarations (these are not implicitly generated if explicitly
    defined by the user in the class method list):
    - Default constructor (with no arguments)
    - Finalize method
  - Method declarations, including implicit Get/Set methods for attributes. For each method, the
    following will appear:
    - Method comment (if any)
    - Method access permissions (e.g., public)
    - Method mode (e.g., const)
    - Method return type (defaulting to int if none specified)
    - Method name
    - Method parameters: type and name for each parameter
    - Method throws list (if specified)
  - Attribute declarations. For each attribute, the following will appear:
    - Attribute comment (if any)
    - Attribute mode (if const treat as readonly)
    - Attribute type (defaulting to int if none specified)
    - Attribute cardinality: if greater than 1, then this is generated as an array (e.g., if attribute
      x has type double and cardinality 10, the resulting code will be double[10] x;)
    - Attribute name
    - Attribute’s initial value (if specified)
  - Containment declarations. For each containment, the following will appear:
    - Same as attribute, but the type is the contained class
  - Relationship declarations. For each relationship, the following will appear:
    - Same as attribute, but the type is the related class
PHP Code Generation uses rules similar to Java Code Generation.
For IDL code generation, if the selected scope is All Models, then all the models are considered (in
the order they were created in the project). Otherwise, only the current model is considered. For each
such model, code is generated for each place’s master (in the order they were created in the model).
The generated files are named according to the path specified in the generation dialog and the name
of each master. The code for each master is generated in a separate file. If the master has a file
name specified in its property sheet, then this file name is used. Otherwise, the file name
defaults to the master name.
As each master is generated, its file name appears in the log window, e.g.,
Generating: Foo.idl.
For each master, the generated file will contain the following:
- Macro definitions to ensure that the file will not be multiply included in other files. E.g.:
#ifndef _Foo_IDL_
#define _Foo_IDL_
...
#endif
- For each master that this master inherits from, contains, or is related to, a #include appears. E.g.:
#include "Woo.idl"
- The interface comment (if any) appears next.
- The interface declaration then appears, consisting of the following:
  - Interface name, followed by the base class names. E.g.:
    interface Foo : Woo {
  - Method declarations. For each method, the following will appear:
    - Method comment (if any)
    - Method mode (if const then treat as oneway)
    - Method return type (defaulting to int if none specified)
    - Method name
    - Method parameters: type and name for each parameter. If no in or out direction is specified,
      then assume in.
    - Method raises list (if specified)
    - Method context (if specified)
  - Attribute declarations. For each attribute, the following will appear:
    - Attribute comment (if any)
    - Attribute mode (if const treat as readonly)
    - Attribute type (defaulting to int if none specified)
    - Attribute name
    - Attribute cardinality: if greater than 1, then this is generated as an array (e.g., if attribute
      x has type double and cardinality 10, the resulting code will be double x[10];)
  - Containment declarations. For each containment, the following will appear:
    - Same as attribute, but the type is the contained class
  - Relationship declarations. For each relationship, the following will appear:
    - Same as attribute, but the type is the related class
Ada 95 Code Generation uses rules similar to C++ Code Generation.
HTML Document Generation is largely the same as RTF Document Generation, but produces more
flexible output in that:
- The information is color coded for ease of viewing.
- There is extensive use of hyperlinks for ease of browsing.
For RTF document generation, the generated document is named according to the path and name specified
in the generation dialog. If no name is specified, then it defaults to the project name. The generated
document will contain the following:
- If the selected scope is All Models, then an appropriate heading is included.
- The documentation for the models is included next. If the scope is Selected Objects or
  Current Model, then only the current model is included. Otherwise, all models are included
  (in the order they have been created in the project).
  For each model included, the following will appear:
  - An image of the model (if the user has selected the image inclusion option in the
    document generation dialog).
  - A heading for the model, e.g., ‘Model: Company (public)’
  - The parent master (if any), e.g., ‘Parent: Enterprise’
  - The model comment (if any), e.g., ‘Comment: ...’
  - The masters used by the model (if any), e.g., ‘Contains: Employee, Department’
- The place masters are included next (in the order they have been created in the project). Only
  those masters appearing in the included models are considered. For each master included, the
  following will appear:
  - A heading for the master, e.g., ‘Component: Employee (public Class)’
  - The base classes for it (if any), e.g., ‘Bases: public Person, protected Worker’
  - The master’s submodel (if any), e.g., ‘Submodel: ReviewProcess’
  - The master comment (if any), e.g., ‘Comment: ...’
  - The master methods (if any), including implicit get/set methods for attributes, e.g., ‘Methods: ...’.
    For each method, the following will appear:
    - Method access permissions (e.g., public)
    - Method mode (e.g., virtual)
    - Method name
    - Method parameters (direction, name, type, and default value for each)
    - Method return type
    - Method comment
  - The master attributes (if any), e.g., ‘Attributes: ...’. For each attribute, the following will
    appear:
    - Attribute access permission (e.g., private)
    - Attribute mode (e.g., static)
    - Attribute type
    - Attribute cardinality (e.g., [10])
    - Attribute’s initial value (e.g., = 10.5)
    - Attribute comment
  - The master containments (if any), e.g., ‘Has: ...’. For each containment, the following will
    appear:
    - Same as attribute, but the type is the contained class
  - The master relationships (if any), e.g., ‘Relations: ...’. For each relationship, the following
    will appear:
    - Same as attribute, but the type is the related class
  - The master exceptions (if any), e.g., ‘Exceptions: ...’. For each exception, the following will
    appear:
    - Exception name
    - Exception members (name, type, and initial value for each)
    - Exception comment
- If document generation completes successfully, the Document generation completed message will
  appear in the log window. Otherwise, the message Document generation aborted will appear in
  the log window.
The new Batch mode (introduced in version 7.0) allows you to run UMLStudio from the command line and, using
command line arguments, execute a PragScript (e.g., to generate code or documentation) or perform reverse
engineering (for hard-coded as well as script-based languages).
The command line syntax is as follows:
UMLStudio.exe -batch
[ProjectPath] [-new[=TemplateName]] [-save[=ProjectPath]]
[-exit] [-script=Script] [-scope=Scope] [-folder=FolderPath] [-file=FilePath]
[-revLang=Language] [-revModel=ModelName]
[-add=FileListFile] [-scriptOption=OptionValue]
where:
- Instead of "UMLStudio.exe", you must specify the full path of this executable as per installation on your
system (e.g., "C:\Program Files\UMLStudio 7.0\UMLStudio.exe").
- Each option name appears after a - or / (shown in bold here).
- The order in which the options appear is not significant.
- Anything enclosed in […] is optional.
- User-specified data (e.g., name of a project) appears in italics.
- All strings are case-insensitive.
- Strings containing blanks should be enclosed in double-quotes.
- ProjectPath should be the path of a UMLStudio project (e.g., "C:\Projects\My Project.pro").
- TemplateName should be the name of a UMLStudio notation template (e.g., "UML.not").
- Script should be the qualified name of a PragScript (e.g., "CodeGen\Java Code Generation.pgs").
Note that the script name must start with one of "CodeGen\", "DocGen\", "Export\", "Import\", "Other\",
"RevEng\", or "Task\" to indicate the script’s category.
- Scope specifies the project scope for code/doc generation and import/export scripts. It should
be one of "Selected Objects", "Current Model", or "All Models". If not specified, it defaults to the last
scope saved with the project.
- FolderPath specifies the "Folder Path" field for code/doc generation and import/export scripts.
- FilePath specifies the "File Path" field for code/doc generation and import/export scripts.
- Language must be one of the built-in languages for reverse engineering (i.e., "C++", "Java", "IDL", or
"Forte TOOL").
- ModelName must be the name of a model in the project. If such a model exists in the project, then
the result of reverse engineering will appear in this model. Otherwise, it will appear in the default model
of the project (i.e., the last 'current' model in the project when it was last saved).
- FileListFile is a file which contains the full path of one or more files,
which are to be reverse engineered. The following rules apply:
- ScriptOption should be the name of a script option (e.g., "AllowDebugging"). All spaces within the option are
automatically removed.
- OptionValue should be "true" or "false" (for boolean options), or a string for value options.
- If an option for a script (or reverse engineering) is not specified, then its default value for the project is
assumed.
The batch option is essential, as it allows UMLStudio to distinguish a command line run from double-clicking a UMLStudio
project in Windows Explorer.
After parsing the command line and error checking, UMLStudio executes the command as follows:
- If the exit option is specified, UMLStudio is run faceless. Otherwise, the GUI is displayed.
- If a ProjectPath has been specified then that project is opened. Otherwise, if a new option is specified,
then a new project is created, either using the specified template, or using the default template.
- If the script option is specified, then the nominated script is run. Before this, however, all specified
script options are first bound to their values, and folder and file options are taken into account. If the
script is a RevEng script, then it is run only when the add option is also present.
- The revLang option is honored only when the script option is not used (i.e., you can either reverse engineer
using a script or using one of the hard-coded languages). This option also requires the add option to be present.
- If the save option is specified, then the project is saved.
- If the exit option is specified, UMLStudio is exited. Otherwise, it will remain visible and continue running.
The following command line opens a project (Test.pro), reverse engineers a set of Java files (listed in JavaFiles.txt)
into the project model called "My Model", saves the project, and then exits. Note how one of the reverse engineering options
(ShowAllAttributesAndMethods) is set to false.
"C:\Program Files\UMLStudio7\UMLStudio.exe" -batch "C:\Projects\Test.pro"
-revLang=Java -revModel="My Model" -add="C:\Projects\JavaFiles.txt"
-ShowAllAttributesAndMethods=false -save -exit
The following command line reverse engineers a set of Ada95 files (listed in AdaFiles.txt) and saves them in a
new project.
"C:\Program Files\UMLStudio7\UMLStudio.exe" -batch -new=UML.not -script="RevEng\Ada 95.pgs"
-add="C:\Projects\AdaFiles.txt" -save="C:\Projects\Test2.pro" -exit
The following command line opens a project (Test.pro), and generates HTML documentation for it.
"C:\Program Files\UMLStudio7\UMLStudio.exe" -batch "C:\Projects\Test.pro"
-script="DocGen\Html Generation.pgs" -folder="C:\Projects\docs" -file="Mydoc.html"
-SortByName=true -exit
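When batch runs are driven from another program, the command line can be assembled programmatically. The following Python sketch only builds the argument list and does not launch anything. It uses option names taken from the syntax above; the helper names option_name and batch_args, and the "Sort By Name" label, are hypothetical, and how UMLStudio treats the quoting applied by a particular process launcher is not covered here.

```python
def option_name(label):
    """Strip blanks from a script option label, e.g. 'Sort By Name' -> 'SortByName'."""
    return label.replace(" ", "")


def batch_args(exe, project=None, script=None, folder=None, file=None,
               script_options=None, save=False, exit_after=True):
    """Build the argument list for a batch run (nothing is executed here)."""
    args = [exe, "-batch"]
    if project:
        args.append(project)
    if script:
        args.append(f"-script={script}")
    if folder:
        args.append(f"-folder={folder}")
    if file:
        args.append(f"-file={file}")
    for label, value in (script_options or {}).items():
        # Boolean options are written as "true" or "false".
        if isinstance(value, bool):
            value = "true" if value else "false"
        args.append(f"-{option_name(label)}={value}")
    if save:
        args.append("-save")
    if exit_after:
        args.append("-exit")
    return args


# Rebuilds the HTML documentation example shown above.
args = batch_args(
    r"C:\Program Files\UMLStudio7\UMLStudio.exe",
    project=r"C:\Projects\Test.pro",
    script=r"DocGen\Html Generation.pgs",
    folder=r"C:\Projects\docs",
    file="Mydoc.html",
    script_options={"Sort By Name": True},
)
print(args)
```

Keeping the arguments as a list rather than one string avoids hand-written quoting for paths that contain blanks.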
Copyright © 1996-2009 PragSoft Corporation (www.pragsoft.com)