diff --git a/Ghidra/Features/Decompiler/certification.manifest b/Ghidra/Features/Decompiler/certification.manifest
index 5e64a5bf6d..ccf3b465a8 100644
--- a/Ghidra/Features/Decompiler/certification.manifest
+++ b/Ghidra/Features/Decompiler/certification.manifest
@@ -1,6 +1,7 @@
##VERSION: 2.0
##MODULE IP: Crystal Clear Icons - LGPL 2.1
##MODULE IP: FAMFAMFAM Icons - CC 2.5
+##MODULE IP: Modified Nuvola Icons - LGPL 2.1
##MODULE IP: Oxygen Icons - LGPL 3.0
##MODULE IP: Tango Icons - Public Domain
Module.manifest||GHIDRA||||END|
@@ -75,6 +76,8 @@ src/main/help/help/topics/DecompilePlugin/images/ForwardSlice.png||GHIDRA||||END
src/main/help/help/topics/DecompilePlugin/images/Undefined.png||GHIDRA||||END|
src/main/help/help/topics/DecompilePlugin/images/camera-photo.png||Tango Icons - Public Domain|||Tango|END|
src/main/help/help/topics/DecompilePlugin/images/decompileFunction.gif||GHIDRA||reviewed||END|
+src/main/help/help/topics/DecompilePlugin/images/document-properties.png||Tango Icons - Public Domain|||tango|END|
+src/main/help/help/topics/DecompilePlugin/images/openFolder.png||Modified Nuvola Icons - LGPL 2.1||||END|
src/main/help/help/topics/DecompilePlugin/images/page_edit.png||FAMFAMFAM Icons - CC 2.5||||END|
src/main/help/help/topics/DecompilePlugin/images/page_white_copy.png||FAMFAMFAM Icons - CC 2.5||||END|
src/main/help/help/topics/DecompilePlugin/images/reload3.png||Crystal Clear Icons - LGPL 2.1||||END|
diff --git a/Ghidra/Features/Decompiler/src/main/doc/decompileplugin.xml b/Ghidra/Features/Decompiler/src/main/doc/decompileplugin.xml
index 8eb4b3efb0..4b8fc27786 100644
--- a/Ghidra/Features/Decompiler/src/main/doc/decompileplugin.xml
+++ b/Ghidra/Features/Decompiler/src/main/doc/decompileplugin.xml
@@ -1,4 +1,8 @@
+
+
+]>
Individual machine instructions
make up the biggest source of information when the
- decompiler analyzes a function. Instructions are translated from their
- processor specific form into Ghidra's IR language (see “P-code”),
+ Decompiler analyzes a function. Instructions are translated from their
+ processor-specific form into Ghidra's IR language (see P-code),
which provides both the control-flow behavior of the instruction and the detailed
- semantics describing how the processor and memory state is affected. The translation is controlled by
+ semantics describing how the processor and memory state are affected. The translation is controlled by
the underlying processor model and, except in limited circumstances, cannot be directly altered
- from the tool. Flow Overrides (see below) can change how certain control-flow is translated,
- and, depending on the processor, context registers may affect p-code (see “Context Registers”).
+ from the tool. Flow Overrides (see below) can change how certain control flow is translated,
+ and, depending on the processor, context registers may affect p-code (see Context Registers).
Outside of the tool, users can modify the model specification itself.
- See the document "SLEIGH: A Language for Rapid Processor Specification".
+ See the document "SLEIGH: A Language for Rapid Processor Specification."
- Decompiling a function starts by analyzing control-flow starting from the function's
- first instruction. Control-flow is traced to additional instructions using flow information
- from the underlying processor model. All paths are traced through instructions with
- fall through, conditional jump, and other
+ Decompiling a function starts by analyzing the control flow of machine instructions.
+ Control flow is traced from the first instruction, through additional instructions depending
+ on their flow semantics (see P-code Control Flow). All paths are traced through instructions with
+ any form of fall-through or jump
semantics until an instruction with terminator semantics is
- reached, which is usually a "return from subroutine"
- instruction. Flow is not traced into called functions, in this situation. Instructions
+ reached, which is usually a formal return (return from subroutine) instruction.
+ Flow is not traced into called functions in this situation. Instructions
with call semantics are treated only as if they fall through.
- An entry point is the address of the function's first instruction.
+ An entry point is the address of the instruction first
+ executed when the function is called.
A function body is the set of addresses reached by control-flow
- analysis (and the machine instructions at those addresses).
+ analysis and the machine instructions at those addresses.
The entry point address for a function plays a pivotal role for
- analysis using the Ghidra decompiler. Ghidra generally associates
+ analysis using the Decompiler. Ghidra generally associates
a formal Function Symbol and an underlying
Function object at this address, which are the key elements that
- need to be present to trigger decompilation.
- (See Functions)
+ need to be present to trigger decompilation
+ (see Functions).
The Function object stores the function body, parameters, local variables, and
other information critical to the decompilation process.
Function Symbols and Function objects are generally created automatically by a Ghidra
- analyzer when initially importing a binary executable and running auto-analysis.
- If necessary however, a user can manually create a Function object from the Listing window
- by using Create Function command (pressing the 'f' key), when the cursor
- is placed on the function's entry point.
- (See Create Function)
+ analyzer when initially importing a binary executable and running Auto Analysis.
+ If necessary, however, a user can manually create a Function object from a Listing window
+ by using the Create Function command (pressing the 'F' key), when the cursor
+ is placed on the function's entry point
+ (see Create Function).
Control-flow behavior for a machine instruction is generally determined by its underlying
- p-code (see “P-code Control Flow”), but this can be changed by applying a Flow Override.
+ p-code (see P-code Control Flow), but this can be changed by applying a Flow Override.
A Flow Override maintains the overall semantics of a branching instruction
but changes how the branch is interpreted. For instance, a
- The decompiler automatically incorporates any relevant Flow Overrides into its
+ The Decompiler automatically incorporates any relevant Flow Overrides into its
analysis of a function. This can have a significant impact on results. The
types of possible Flow Overrides include:
- The decompiler automatically incorporates comments from the Program database into its
+ The Decompiler automatically incorporates comments from the Program database into its
output. Comments in Ghidra are centralized and can be created and displayed by multiple
- Program views, including the decompiler. Comments created from a decompiler window will
- show up in the Listing window for instance, and vice versa.
+ Program views, including the Decompiler. Comments created from a Decompiler window will
+ show up in a Listing window, for instance, and vice versa.
- For the purposes of understanding comments within the decompiler, keep in mind that:
+ For the purposes of understanding comments within the Decompiler, keep in mind that:
- The decompiler collects and displays comments associated with any address in the
+ The Decompiler collects and displays comments associated with any address in the
formal function body currently decompiling.
The comments are integrated line by line into the decompiled code, and an
individual comment is displayed on the line before the
@@ -238,24 +239,24 @@
Because a single line of code typically encompasses multiple machine instructions,
there is a possibility that multiple comments at different addresses apply to
- the same line. In this case, the decompiler displays each comment on its
+ the same line. In this case, the Decompiler displays each comment on its
own line, in address order, directly before the line of code.
- Because the output of the decompiler can be a heavily transformed version compared
- to the original machine instructions, its possible that individual instructions
+ Because the output of the Decompiler can be a heavily transformed version compared
+ to the original machine instructions, it is possible that individual instructions
no longer have explicit tokens representing them in the output. Comments attached
- to these instruction will still be displayed in the decompiler output with the
+ to these instructions will still be displayed in the Decompiler output with the
closest associated line of code, usually within the same basic block.
- By default, the decompiler displays only the Pre comments
+ By default, the Decompiler displays only the Pre comments
within the body of the function. It also displays Plate
comments, but only if they are attached to the entry point
- of the function. In this case, they are displayed first in the decompiler output,
+ of the function. In this case, they are displayed first in the Decompiler output,
along with WARNING comments, before the function declaration. Other comment
- types can be configured to display in decompiler output, by changing the
- decompiler Display options (See Display <kind-of> Comments).
+ types can be configured to be part of Decompiler output by changing the
+ Decompiler display options (see Display <kind-of> Comments).
- The decompiler may decide as part of its analysis that individual
+ The Decompiler may decide as part of its analysis that individual
basic blocks are unreachable and not display them in the output.
In this case, any comments associated with addresses in the unreachable block
will also not be displayed.
@@ -288,9 +289,9 @@
Warning Comments
- The decompiler can generate internal warnings during its analysis and will incorporate
- them into the output as comments in the same way as the user defined
- comments described above. They are not part of Ghidra's comment system however and
+ The Decompiler can generate internal warnings during its analysis and will incorporate
+ them into the output as comments in the same way as the user-defined
+ comments described above. They are not part of Ghidra's comment system, however, and
cannot be edited. They can be distinguished from normal comments by the word
'WARNING' at the beginning of the comment.
Variable annotations are the most important way to get names and data-types
- that are meaningful to the user incorporated into the decompiler's output.
+ that are meaningful to the user incorporated into the Decompiler's output.
A variable in this context is loosely defined
as any piece of memory that code in the Program treats as a logical entity.
- The decompiler works to incorporate all forms of annotation into its output
+ The Decompiler works to incorporate all forms of annotation into its output
for any variable pertinent to the function being analyzed.
@@ -344,9 +345,9 @@
local to a function.
- Global variables annotations are created from the tool by applying a data-type to a memory
- location in the Listing window, either by invoking a command from the Data
- pop-up menu, or dragging a data-type from the Data Type Manager
+ Global variable annotations are created from the tool by applying a data-type to a memory
+ location in a Listing window, either by invoking a command from the Data
+ pop-up menu or by dragging a data-type from the Data Type Manager
window directly onto the memory location. Refer to the documentation:
- Local variables annotations are created from the Listing from various editor dialogs. See in particular:
+ Local variable annotations are created from the Listing using various editor dialogs. See, in particular:
- All variables belong either to a global or local
- scope, which directly affects how the variable is treated in the decompiler's data-flow
+ All variables belong to either a global or local
+ scope, which directly affects how the variable is treated in the Decompiler's data-flow
analysis.
- Annotations created by applying a data-type directly to a memory location in the listing
+ Annotations created by applying a data-type directly to a memory location in the Listing
are automatically added to the formal global namespace.
Ghidra can create other custom namespaces that are considered global in this sense, and
renaming actions provide options that let individual global annotations be moved into
@@ -452,7 +453,7 @@
create variable annotations that are local to that function.
- A global variable annotation forces the decompiler to treat the memory location as if its value
+ A global variable annotation forces the Decompiler to treat the memory location as if its value
persists beyond the end of the function. The variable must exist
at all points of the function body, generally at the same memory location.
- The decompiler understands traditional primitive data-types, in all their various sizes,
+ The Decompiler understands traditional primitive data-types in all their various sizes,
like integers, floating-point numbers, booleans, and characters. It also understands
pointers, structures, and arrays, letting it support
arbitrarily complicated composite data-types. Ghidra provides
some data-types with specialized display capabilities that don't have a natural representation
- in the high-level language output by the decompiler. The decompiler treats these as
+ in the high-level language output by the Decompiler. The Decompiler treats these as
black-box data-types, preserving the name, but treating the underlying data either as an integer
or simply as an array of bytes.
- The undefined data-types are supported, in their various sizes:
+ The undefined data-types are supported in their various sizes:
undefined1, undefined2,
undefined4, etc. In Ghidra, the undefined
- data-types, let the user specify the size of a variable, while formally declaring that
+ data-types let the user specify the size of a variable, while formally declaring that
other details about the data-type are unknown.
- For the decompiler, undefined data-types as an annotation have the important special meaning
- that the decompiler should let its analysis determine the final data-type presented in the
- output for the variable (See “Forcing Data-types” below).
+ For the Decompiler, undefined data-types, as an annotation, have the important special meaning
+ that the Decompiler should let its analysis determine the final data-type presented in the
+ output for the variable (see Forcing Data-types below).
The void data-type is supported but treated specially by
- the decompiler, as does Ghidra in general. A void can be
+ the Decompiler, as it is by Ghidra in general. A void can be
used to indicate the absence of a return value in function prototypes, but cannot be used
as a general annotation on variables. A void pointer, void *,
- is possible; the decompiler treats it as a pointer to an unknown data-type.
+ is possible; the Decompiler treats it as a pointer to an unknown data-type.
Integer data-types, both signed and unsigned, are supported up to a size of 8 bytes. Larger
sizes are supported internally but are generally represented as an array of bytes in
- decompiler output. Odd integer sizes are also supported.
+ Decompiler output. Nonstandard integer sizes of 3, 5, 6, and 7 bytes are also supported.
The standard C data-type names: int, short,
@@ -558,7 +559,7 @@
Floating-point sizes of 4, 8, 10, and 16 are supported, mapping in all cases currently to the
float, double,
float10, and float16
- data-types respectively. The decompiler currently cannot display floating-point constants
+ data-types, respectively. The Decompiler currently cannot display floating-point constants
that are bigger than 8 bytes.
- ASCII or Unicode encoded character data-types are supported for sizes of 1, 2, and 4. The size effectively
- chooses between the UTF8, UTF16, and UTF32 character encodings respectively. The standard
+ ASCII- and Unicode-encoded character data-types are supported for sizes of 1, 2, and 4. The size effectively
+ chooses between the UTF8, UTF16, and UTF32 character encodings, respectively. The standard
C data-type names char and wchar_t are
mapped to one of these sizes based on the
processor and compiler selected when importing the Program.
@@ -579,11 +580,11 @@
String
- Terminated strings, encoded either in ASCII or Unicode, are supported. The decompiler converts
+ Terminated strings, encoded either in ASCII or Unicode, are supported. The Decompiler converts
Ghidra's dedicated string data-types like string to
- an "array of characters" data-type, such as char[],
+ an array-of-characters data-type, such as char[],
where the character size matches the encoding.
- A "pointer to character" data-type like
+ A pointer-to-character data-type like
- is also treated as a potential string reference. The decompiler can infer terminated strings if this
+ is also treated as a potential string reference. The Decompiler can infer terminated strings if this
kind of data-type propagates to constant values during its analysis.
- Strings should be fully rendered in decompiler output,
+ Strings should be fully rendered in Decompiler output,
with non-printable characters escaped using either traditional sequences like '\r', '\n' or using Unicode
escape sequences like '\xFF'.
Pointer data-types are fully supported. A pointer to any other supported data-type is
- possible. The data-type being pointed to, whether its a primitive, structure, or another pointer,
- informs how the decompiler renders a dereferenced pointer.
- The decompiler assumes that a pointer variable may refer to an array of
+ possible. The data-type being pointed to, whether it is a primitive, structure, or another pointer,
+ informs how the Decompiler renders a dereferenced pointer.
+ The Decompiler assumes that a pointer variable may refer to an array of
the underlying data-type and will use array notation if there is evidence of more than
one element.
The default pointer size is set based on the processor and compiler selected when the Program is
- imported and generally matches the size of the ram (or equivalent)
- address space. Different pointer sizes within the same Program are possible. The decompiler generally
+ imported and generally matches the size of the ram or equivalent
+ address space. Different pointer sizes within the same Program are possible. The Decompiler generally
expects the pointer size to match the size of the address space being pointed to, but individual
architectures can model different size pointers into the space (such as near pointers).
For processors with more than one memory address space, pointer data-types currently cannot be directly
- annotated to indicate a preferred address space. Where there is ambiguity, the decompiler attempts to
+ annotated to indicate a preferred address space. Where there is ambiguity, the Decompiler attempts to
determine the correct address space from the context of its use within the function.
- Structured data-types are fully supported. The decompiler does not automatically infer structures
- when analyzing a function; it propagates structured data-types into the function from explicitly
- annotated sources, like input parameters or global variables. Decompiler directed creation of
- structures can be triggered by the user, see “Auto Create Structure”.
+ Structure data-types are fully supported. The Decompiler does not automatically infer structures
+ when analyzing a function; it propagates them into the function from explicitly
+ annotated sources, like input parameters or global variables. Decompiler-directed creation of
+ structures can be triggered by the user (see Auto Create Structure).
- Enumerations are fully supported. The decompiler can propagate enumerations from explicitly
+ Enumerations are fully supported. The Decompiler can propagate enumerations from explicitly
annotated sources throughout a function onto constants, which are then displayed with the
appropriate label from the definition of the enumeration. If the constant does not match a
- single value in the enumeration definition, the decompiler attempts to build a matching
+ single value in the enumeration definition, the Decompiler attempts to build a matching
value by or-ing together multiple labels.
- The decompiler can be made to break out constants representing packed flags,
+ The Decompiler can be made to break out constants representing packed flags,
for instance, by labeling individual bit values within an enumeration.
A Function Definition in Ghidra is a data-type that encodes
information about the parameters and return value for a generic/unspecified function.
- A formal function pointer is supported by the decompiler as a pointer
+ A formal function pointer is supported by the Decompiler as a pointer
data-type that points to a Function Definition. A Function Definition specifically encodes:
The Function Definition itself does not encode any storage information. Once the Function
- Definition is associated with a Program, the indicator maps to one of the prototype models for the
- specific processor and compiler. A Function Definition is currently limited to a prototype model
+ Definition is associated with a Program, its generic calling convention maps to one of the
+ specific prototype models for the processor and compiler. The prototype model is then used
+ to assign storage for parameters and return values, wherever the Function Definition is applied.
+ A Function Definition is currently limited to a prototype model
with one of the following names:
- The decompiler performs type propagation as part of its analysis
- on functions. Data-type information is collected from variable annotations (and other sources),
- which is then propagated via data-flow throughout the function to other variables and
+ The Decompiler performs type propagation as part of its analysis
+ on functions. Data-type information is collected from variable annotations and other sources,
+ which is then propagated via data flow throughout the function to other variables and
constants where the data-type may not be immediately apparent.
- With few exceptions, a variable annotation is forcing on the decompiler in the sense
+ With few exceptions, a variable annotation is forcing on the Decompiler in the sense
that the storage location being annotated is considered an unalterable data-type source. During
type propagation, the data-type may propagate to other variables,
but the variable representing the storage location being annotated is guaranteed to have
@@ -746,15 +749,15 @@
The major exception to forcing annotations is if the data-type in the annotation is undefined.
- Ghidra reserves the following names to represent formally undefined data-types:
+ Ghidra reserves specific names to represent formally undefined data-types, such as:
@@ -770,29 +772,29 @@
The number in the name only specifies the number of bytes in the variable.
- The decompiler views a variable annotation with an undefined data-type only as an indication of what name
+ The Decompiler views a variable annotation with an undefined data-type only as an indication of what name
should be used if a variable at that storage address exists. The data-type for the variable is filled in,
using type propagation from other sources.
For annotations that specifically label a function's formal parameters or return value,
- the Signature Source also affects how they're treated by the decompiler.
+ the Signature Source also affects how they're treated by the Decompiler.
If the Signature Source is set to anything other than DEFAULT, there is a forced
- one-to-one correspondence between variable annotations and actual parameters in the decompiler's
- view of the function. This is stronger than just forcing the data-type; the existence (or not) of
+ one-to-one correspondence between variable annotations and actual parameters in the Decompiler's
+ view of the function. This is stronger than just forcing the data-type; the existence or nonexistence of
the variable itself is forced by the annotation in this case. If the Signature Source is forcing and
there are no parameter annotations, a void prototype is forced on the function.
A forcing Signature Source is set typically if debug symbols for the function are read in during
- Program import (IMPORTED), or if the user manually edits the function prototype
+ Program import (IMPORTED) or if the user manually edits the function prototype
directly (USER_DEFINED).
If an annotation and the Signature Source force a parameter to exist, specifying an
- undefined data-type in the annotation still directs the decompiler to fill in
+ undefined data-type in the annotation still directs the Decompiler to fill in
the variable's data-type using type propagation. The same holds true for the return value; an
- undefined annotation fixes the size of the return value, but the decompiler
+ undefined annotation fixes the size of the return value, but the Decompiler
fills in its own data-type.
__stdcall or
- __thiscall. The decompiler makes use of the prototype model, as assigned to the function by the user or
+ Prototype models are architecture-specific, and depending on the compiler, a single Program may make
+ use of multiple models. Subsequently, each distinct model has a name like x86:LE:32:default:windows<Root>/Ghidra/Processors/<Family>/data/languages
*.slaspec files, which may recursively include
one or more *.sinc files. The format of these files is described
- in the document "SLEIGH: A Language for Rapid Processor Specification".
+ in the document "SLEIGH: A Language for Rapid Processor Specification."
.text or .bss section). Because it is in the
- load image directly, an annotation with this storage shows up directly in the .text or .bss section. Because it is in the
+ load image directly, an annotation with this storage shows up directly in any Listing
window and can be directly manipulated there. In much of the Ghidra documentation, these annotations
- are referred to as undefined4 Stack[-0x14]:4 local_14
int EAX:4 iVar1
@@ -2171,32 +2227,32 @@
the annotation, and the value after the colon indicates the number of bytes in the register.
int HASH:5f96367122:4 iVar2
<RETURN>, and it usually
+ and storage location. The storage location applies at the moment control flow exits the function. If it exists, the annotation is shown
+ in a Listing window as part of the Function header with the name <RETURN>, and it usually
corresponds directly with the return value in the (auto). Decompiler output will generally display auto-parameters as explicit variables
+ rather than hiding them.
assume directives, and regions can be generally viewed from the
Register Manager window.
analyzeHeadless command, scripts, or other
- plug-ins that make use of the decompiler service.
+ It is currently analyzeHeadless command, scripts, and other
+ plug-ins that make use of the Decompiler service.
if (false) { ... }
+ halt_unimplemented() at the point of the unimplemented instruction, and
+ control flow does not fall through.
+ if (false) { ... }.
- halt_unimplemented() at the point of the unimplemented instruction, and
- control-flow does not fall through.
- if/else blocks that share the same predicate so that the condition is only
printed once.
+= and <<=, in its output syntax.
/* C style comments */ and // C++ style comments.
NULL. Otherwise the pointer value is represented with the '0' character,
- which is then type cast into a pointer.
+ which is then cast to a pointer.
__stdcall, __thiscall etc. See the
- discussion in <prototype> always has a <prototype> tag always has a <callotherfixup> tag has a
@@ -3316,7 +3419,7 @@
<prototype>, <callfixup>, or <callotherfixup>,
as its single root element. Users can find numerous examples within the compiler
- and processor specification files that come as part of Ghidra's installation.
- See WARNING:. They occur either at the
beginning of the function as part of the function header or at the point in the code directly
- associated with the warning. (See UndefinedFunction. The function
is assigned the default calling convention, and parameters are discovered as part of
- the decompiler's analysis.
+ the Decompiler's analysis.
field_0x.., it may be that the field offset was
- discovered by the decompiler, and the field does not exist in the structure definition.
- In this case, a new field is created at that offset, with the new name and a data-type of "undefined".
+ discovered by the Decompiler, and the field does not exist in the structure definition.
+ In this case, a new field is created at that offset, with the new name and a data-type of
+ <RETURN>) did not exist previously, one is created.
@@ -4624,7 +4755,7 @@
@@ -57,21 +58,21 @@
- The decompiler does not use the formal function body when it computes
- control-flow; it recomputes its own idea of the function body starting from the entry point
+ The Decompiler does not use the formal function body when it computes
+ control flow; it recomputes its own idea of the function body starting from the entry point
it is handed. If the formal function body was created manually, using a selection for instance,
- or in other extreme circumstances, the decompiler's view of the function body may not match
- the formal view. This can lead to confusing behavior, where clicking in a decompiler window
+ or in other extreme circumstances, the Decompiler's view of the function body may not match
+ the formal view. This can lead to confusing behavior, where clicking in a Decompiler window
may unexpectedly navigate the window away from the function.
JMP instruction, which traditionally
represents a branch within a single function, can be overridden to represent a call to a new function.
Flow Overrides are applied by Analyzers or manually by the user.
@@ -263,7 +264,7 @@
- Unlike the Listing window, the decompiler does not alter how a comment is
+ Unlike a Listing window, the Decompiler does not alter how a comment is
displayed based on its type.
All enabled types of comment are displayed in the same way, on
a separate line before the line of code associated with the address.
@@ -276,7 +277,7 @@
Unreachable Blocks
@@ -381,16 +382,16 @@
@@ -596,11 +597,11 @@
- Users should be aware that variable annotations are forcing on the decompiler and may directly
+ Users should be aware that variable annotations are forcing on the Decompiler and may directly
override aspects of its analysis. Because of this, variable annotations are the most powerful way
- for the user to affect decompiler output, but setting an incomplete (or incorrect) data-type as
- part of an annotation may produce poorer decompiler output.
+ for the user to affect Decompiler output, but setting an incomplete or incorrect data-type as
+ part of an annotation may produce poorer Decompiler output.
@@ -762,7 +765,6 @@
@@ -801,9 +803,9 @@
- The decompiler may still use an undefined data-type to label a variable,
+ The Decompiler may still use an undefined data-type to label a variable,
even after type propagation. If a variable is simply copied around within a function and there
- are no other substantive operations or annotations on the variable, the decompiler may decide the undefined
+ are no other substantive operations or annotations on the variable, the Decompiler may decide the undefined
data-type is appropriate.
.text or .bss section). Because it is in the
- load image directly, an annotation with this storage shows up directly in the Listing
+ program section, such as the .text or .bss section. Because it is in the
+ load image directly, an annotation with this storage shows up directly in any Listing
window and can be directly manipulated there. In much of the Ghidra documentation, these annotations
- are referred to as Data. See the section
- Data in particular.
+ are referred to as Data. See the
+ Data section, in particular.
- Although specific architectures may vary, generally a storage location at a load image address
+ Although specific architectures may vary, a storage location at a load image address generally
represents a formal global variable, and the annotation is in scope
- across all Program execution. For the decompiler, the storage location is treated as a
+ across all Program execution. For the Decompiler, the storage location is treated as
a single persistent variable in all functions that reference it. Within a
- function, all distinct references to the storage location (varnodes) are merged. The decompiler
+ function, all distinct references (varnodes) to the storage location are merged. The Decompiler
expects a value at the storage location to exist from before the start of the function, and any
change to the value must be explicitly represented as an assignment to
- the variable in decompiler output.
+ the variable in Decompiler output.
- Within the Listing window, a stack annotation is displayed as part of the function header - (at the entry point address of the function), with a syntax similar to: + Within a Listing window, a stack annotation is displayed as part of the function header + at the entry point address of the function, with a syntax similar to:
undefined4 Stack[-0x14]:4 local_14
The middle field (the Variable Location field) indicates that the storage location is on the - stack, and the value in brackets indicates the offset of the storage location, relative to the incoming + stack, and the value in brackets indicates the offset of the storage location relative to the incoming stack pointer. The value after the colon indicates the number of bytes in the storage location.
Currently, the entire body of the function is included - in the scope of any stack annotation, and the decompiler will allow only a single variable to exist + in the scope of any stack annotation, and the Decompiler will allow only a single variable to exist at the stack address. A stack annotation can be a formal parameter to the function, but otherwise the - decompiler does not expect to see a value that exists before the start of the function. + Decompiler does not expect to see a value that exists before the start of the function.
- The decompiler will continue to perform copy propagation and other transforms on - stack locations associated with a variable annotation. In particular, within decompiler output, - a specific write operation to a stack address may not show up as an explicit assignment to its variable, - if the value is simply copied to another location. + The Decompiler will continue to perform copy propagation and other transforms on + stack locations associated with a variable annotation. In particular, within Decompiler output, + if the value is simply copied to another location, + a specific write operation to a stack address may not show up as an explicit assignment to its variable.
A variable annotation can refer to a specific register for the processor associated with the Program. In general, such an annotation will be for a variable local to a particular function. - Within the Listing window, this annotation is displayed as part of the function header, with + Within a Listing window, this annotation is displayed as part of the function header, with syntax like:
- For a local variable annotations with a register storage location, there is an expectation that the + For local variable annotations with a register storage location, there is an expectation that the register may be reused for different variables at different points of execution within the function. There may be more than one annotation, for different variables, that share the same register storage location. An annotation is associated with a first use point that describes where - the register first holds a value for the particular variable. (See the discussion - “Varnodes in the Decompiler”) + the register first holds a value for the particular variable (see the discussion - Varnodes in the Decompiler). The entire scope of the annotation is limited to the address regions between the first use point - and any points where the value is read. The decompiler may extend the scope as part of its + and any points where the value is read. The Decompiler may extend the scope as part of its merging process, but the full extent is not stored in the annotation.
Variable annotations can have a temporary register as a storage location. A temporary register is not specific to a processor but is produced at various stages of the decompilation process. See the discussion of the unique - space in “Address Space”. These registers do not have a meaningful name, and - the specific storage address may change on successive decompilations. So within the - Listing window, this annotation is displayed as part of the function header, + space in Address Space. These registers do not have a meaningful name, and + the specific storage address may change on successive decompilations. So, within a + Listing window, this annotation is displayed as part of the function header with syntax like:
The Variable Location field displays the internal hash used to uniquely - identify the temporary register within the data-flow of the function. + identify the temporary register within the data flow of the function.
A temporary register annotation must be for a local variable, and as with an ordinary register, @@ -953,7 +955,7 @@
Every formal Function in Ghidra is associated with a set of variable annotations and other properties that make up the function prototype. Due to the nature of reverse engineering, - the function prototype may only include partial information and may be built up over time. Individual + the function prototype may include only partial information and may be built up over time. Individual elements include:
Each formal input to the function can have a Variable Annotation that describes its name, data-type, - and storage location, at the moment control-flow enters the function. If annotations exist, they are shown - in the Listing Window as part of the Function header, and they usually correspond directly with symbols in the - function declaration produced by the decompiler. + and storage location. The storage location applies at the moment control flow enters the function. + If annotations exist, they are shown + in a Listing window as part of the Function header, and they usually correspond directly with symbols in the + function declaration produced by the Decompiler.
The value returned by a function can have a special Variable Annotation that describes its data-type
- and storage location, at the moment control-flow exits the function. If it exists, the annotation is shown
- in the Listing Window as part of the Function header with the name <RETURN>, and it usually
+ and storage location. The storage location applies at the moment control flow exits the function. If it exists, the annotation is shown
+ in a Listing window as part of the Function header with the name <RETURN>, and it usually
corresponds directly with the return value in the function declaration produced by
- the decompiler.
+ the Decompiler.
+
+ Specific prototypes may require auto-parameters like this
+ or __return_storage_ptr__. These are special input parameters
+ that compilers may use to implement specific high-level language concepts. See the discussion
+ in Auto-Parameters. Within Ghidra, auto-parameters are automatically created by the
+ Function Editor Dialog
+ if the desired prototype requires them.
+ Within a Listing window, auto-parameters look like other parameter annotations, but the storage field shows the
+ string (auto). Decompiler output will generally display auto-parameters as explicit variables
+ rather than hiding them.
- The calling convention used by the function can be specified as part of the function prototype. The convention - is specified by name, referring to the formal “Prototype Model” that describes how storage + The calling convention used by the function is specified as part of the function prototype. The convention + is specified by name, referring to the formal Prototype Model that describes how storage locations are selected for individual parameters along with other information about how the compiler treats - the function. Available models are determined by the processor and compiler, but can be extended by the user. - See “Specification Extensions”. + the function. Available models are determined by the processor and compiler, but may be extended by the user + (see Specification Extensions).
- In the absence of parameter and return value annotations, the decompiler will use the prototype model as + In the absence of input parameter and return value annotations, the Decompiler will use the prototype model as part of its analysis to discover the input parameters and the return value of the function.
- The name "unknown" is reserved to indicate that nothing is known about the calling convention. If - set to "unknown", depending on context, the decompiler may assign the calling convention based on - the Prototype Evaluation option (See Prototype Evaluation), or it + The name unknown is reserved to indicate that nothing is known about the calling convention. If + set to unknown, depending on context, the Decompiler may assign the calling convention based on + the Prototype Evaluation option (see Prototype Evaluation), or it may use the default calling convention for the architecture.
Functions have a boolean property called variable arguments, which can be turned on - if the function is capable of being passed a variable number of inputs. This property informs the decompiler that + if the function is capable of being passed a variable number of inputs. This property informs the Decompiler that the function may take additional parameters beyond any with an explicit variable annotation. This affects decompilation of any function which calls the variable arguments function, allowing - the decompiler to discover unlisted parameters at a given call site. + the Decompiler to discover unlisted parameters at a given call site.
- A function can be marked explicitly as not returning, meaning that once - a call is made to the function, execution will never return to the caller. The decompiler uses this to - compute the correct control-flow in any calling functions. + A function can be marked with the no return property, meaning that once + a call is made to the function, execution will never return to the caller. The Decompiler uses this to + compute the correct control flow in any calling functions.
- If the boolean property in-line is turned on for a particular function, - it directs the decompiler to inline the effects of the function into the decompilation of any of its calling functions. - The function will no longer appear as a direct function call in the decompilation, but all of its data-flow + If the in-line property is turned on for a particular function, + it directs the Decompiler to inline the effects of the function into the decompilation of any of its calling functions. + The function will no longer appear as a direct function call in the decompilation, but all of its data flow will be incorporated into the calling function.
- This is useful for bookkeeping functions, where its important for the decompiler to + This is useful for bookkeeping functions, where it is important for the Decompiler to see its effects on the calling function. Functions that set up the stack frame for a caller or functions that look up or dispatch a switch destination are typical examples that should be marked in-line.
@@ -1033,7 +1050,7 @@This property is similar in spirit to marking a function as in-line. - A call-fixup directs the decompiler to replace any call to the function with a specific + A call-fixup directs the Decompiler to replace any call to the function with a specific chunk of raw p-code. The decompilation of any calling function no longer shows the function call, but the chunk of p-code incorporates the called function's effects.
@@ -1044,7 +1061,7 @@Call-fixups are specified by name. The name and associated p-code chunk are typically defined in the compiler specification for the Program. Users can extend the available set - of call-fixups. See “Specification Extensions”. + of call-fixups (see Specification Extensions).
If the Signature Source is set to anything other than DEFAULT, the - function's prototype information is forcing on the decompiler. See the discussion - in “Forcing Data-types” + function's prototype information is forcing on the Decompiler (see the discussion + in Forcing Data-types).
The input parameter and return value annotations of the function prototype, like - any variable annotations, can be forcing on the decompiler. - See the complete discussion in “Forcing Data-types”. + any variable annotations, can be forcing on the Decompiler + (see the complete discussion in Forcing Data-types). But keep in mind:
| - The input parameters and return value are all forced on the decompiler as a unit based on the + The input parameters and return value are all forced on the Decompiler as a unit based on the Signature Source. They are all forced if the type is set to anything other than DEFAULT; otherwise none of them are forced. |
- If the function prototype's annotations are not forced, the decompiler will attempt to discover the parameters + If the function prototype's annotations are not forcing, the Decompiler will attempt to discover the parameters and return value using the calling convention. The prototype model underlying the calling convention dictates which storage locations can be considered as parameters and their formal ordering.
@@ -1133,7 +1150,7 @@ can be built for the function.- The decompiler will disregard the calling convention's rules in this situation and use the custom storage + The Decompiler will disregard the calling convention's rules in this situation and use the custom storage locations for parameters and the return value. Other aspects of the calling convention, like the unaffected list, will still be used.
@@ -1145,8 +1162,8 @@ Data Mutability- Mutability is a description of how values in a specific memory region - (either a single variable or a larger block) can change during Program execution, based either on + Mutability is a description of how values in a specific memory region, + either a single variable or a larger block, can change during Program execution based either on properties or established rules. Ghidra recognizes the mutability settings:
- Mutability affects decompiler analysis and can have a large impact the output. + Mutability affects Decompiler analysis and can have a large impact on the output.
- Most memory has normal mutability, meaning: + Most memory has normal mutability; the value at the memory location may change over the course of executing the Program, but for a given section of code, the value will not change unless an instruction explicitly writes to it.
Mutability can be set on an entire block of memory in the Program, typically from the Memory Map. - It can also be set as part of a single Variable Annotation. From the Listing Window for instance, + It can also be set as part of a single Variable Annotation. From a Listing window, for instance, use the Settings dialog.
The constant mutability setting indicates that values within the memory region are read-only and don't change during Program execution. If a read-only variable is - accessed in a function being analyzed by the decompiler, its constant value, if present in the - Program's load image, replaces the variable within data-flow for the - function. The decompiler may propagate the constant and fold it in to other operations, which + accessed in a function being analyzed by the Decompiler, its constant value, if present in the + Program's load image, replaces the variable within data flow for the + function. The Decompiler may propagate the constant and fold it into other operations, which can have a substantial impact on the final output.
The volatile mutability setting indicates that values within the memory region may change unexpectedly, even if the code currently executing does not directly - write to it. If a volatile variable is accessed in a function being analyzed by the decompiler, + write to it. If a volatile variable is accessed in a function being analyzed by the Decompiler, each specific access is replaced with a built-in function call, which prevents constant propagation and other transforms across the access. The built-in functions are named based on - whether the access is a read or write and then the size - of the access. Within decompiler output, the first parameter to a built-in function is a symbol + whether the access is a read or write and on the size + of the access. Within the Decompiler output, the first parameter to a built-in function is a symbol indicating the volatile variable. The function returns a value in the case of a volatile read or takes a second parameter in the case of a volatile write.
- write_volatile_1(DAT_mem_002b,0x20); X = read_volatile_2(SREG); + write_volatile_1(DAT_mem_002b,0x20);
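The naming rule for the volatile built-in functions can be illustrated with a minimal sketch. It is not Decompiler code: the helper names `volatile_builtin_name` and `volatile_access` are hypothetical, and reading the numeric suffix as a size in bytes is an assumption drawn from the example above.

```python
from typing import Optional


def volatile_builtin_name(is_write: bool, size: int) -> str:
    """Build a built-in name from the access kind and the access size.

    Assumption: the numeric suffix is the access size in bytes.
    """
    kind = "write" if is_write else "read"
    return f"{kind}_volatile_{size}"


def volatile_access(symbol: str, size: int, value: Optional[str] = None) -> str:
    """Render one volatile access as a call expression (hypothetical helper).

    The first parameter is always the symbol for the volatile variable.
    A write takes the stored value as a second parameter; a read takes
    only the symbol, and the call itself yields the value that was read.
    """
    if value is None:
        return f"{volatile_builtin_name(False, size)}({symbol})"
    return f"{volatile_builtin_name(True, size)}({symbol},{value})"


# Reproduce the two accesses from the documentation example.
print(volatile_access("DAT_mem_002b", 1, "0x20") + ";")
print("X = " + volatile_access("SREG", 2) + ";")
```

Because each access becomes an opaque call, constant propagation and similar transforms cannot be applied across it, which is the effect the text describes.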
@@ -1214,14 +1231,14 @@ Constant Annotations
- Ghidra provides numerous actions to control how specific constants are formatted or displayed. - An annotation can be applied directly to a constant in the Decompiler Window, which always affects - decompiler output. Or, an annotation can be applied to the constant operand of a specific machine - instruction displayed in the Listing Window. In this case, to the extent possible, the decompiler - attempts to track the operand and apply the annotation to the matching constant in the decompiler output. - However, the constant may be transformed from its value in the original machine instruction during the decompiler's - analysis. The decompiler will follow the constant through simple transformations, but if the constant strays - too far from its original value, the annotation will not be applied. The transforms followed are: + Ghidra provides numerous actions to control how a specific constant is formatted or displayed. + An annotation can be applied directly to a constant in a Decompiler window, which always affects + Decompiler output. Or, an annotation can be applied to the constant operand of a specific machine + instruction displayed in a Listing window. In this case, to the extent possible, the Decompiler + attempts to track the operand and apply the annotation to the matching constant in the Decompiler output. + However, the constant may be transformed from its value in the original machine instruction during the Decompiler's + analysis. The Decompiler will follow the constant through one of the following simple transformations, but + otherwise the annotation will not be applied.
An equate can be applied to a machine instruction with a constant operand by using the Set Equate - menu from the Listing Window. If the decompiler successfully follows the operand to a matching constant, - the equate's name is displayed as part of the decompiler's output as well as in the Listing Window. - A transformed operand is displayed as an expression, where the transforming operations are applied to - the equate symbol (representing the original constant). + menu from a Listing window. If the Decompiler successfully follows the operand to a matching constant, + the equate's name is displayed as part of the Decompiler's output as well as in any Listing window. + A transformed operand is displayed as an expression, where the transforming operation is applied to + the equate symbol representing the original constant.
- Alternately an equate can be applied directly to a constant from the Decompiler Window using its - “Set Equate ...” menu. The constant may or may not have a corresponding instruction - operand but will be displayed in decompiler output using the descriptive string. -
-- + Alternatively, an equate can be applied directly to a constant from a Decompiler window using its + Set Equate... menu. The constant may or may not have a corresponding instruction + operand but will be displayed in Decompiler output using the descriptive string.
- Ghidra can apply a format conversion to integer constants that are displayed - in decompiler output. + Ghidra can apply a format conversion to any integer constant that is displayed + in Decompiler output.
A conversion can be applied to the machine instruction containing the constant as an operand using the Convert menu option - from the Listing Window. If the decompiler successfully traces the operand to a matching constant, - the format conversion is applied in the decompiler output as well as in the Listing Window. + from a Listing window. If the Decompiler successfully traces the operand to a matching constant, + the format conversion is applied in the Decompiler output as well as in the Listing window.
- Alternately, a conversion can be applied directly to an integer constant in the - Decompiler Window using its “Convert” menu option. The constant may or may not - have a corresponding instruction operand but is displayed in decompiler output using the conversion. + Alternatively, a conversion can be applied directly to an integer constant in a + Decompiler window using its Convert menu option. The constant may or may not + have a corresponding instruction operand but is displayed in Decompiler output using the conversion.
- Conversions applied by the decompiler are currently limited to: + Conversions applied by the Decompiler are currently limited to:
- An appropriate header matching the format is prepended to the representation string, either "0b", "0x" or just - "0". The decompiler will not switch the signedness of the constant but preserves the signed or unsigned data-type - as determined by analysis. + If necessary, a header matching the format is prepended to the representation string, either "0b", "0x" or just + "0". A conversion will not switch the signedness of the constant; the signed or unsigned data-type associated + with the constant, as determined by analysis, is preserved. If the constant is negative, with a signed data-type, + the representation string will always start with a '-' character.
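The header and sign rules can be sketched as follows. This is an illustration under stated assumptions, not the Decompiler's implementation: the function name `render_constant` and the format names are hypothetical, the set of formats is inferred from the three headers quoted above plus a header-less decimal form, and placing the '-' before the header is an assumption based on the statement that the string always starts with '-'.

```python
# Assumed formats: each maps to its header and a Python format spec.
FORMATS = {
    "binary": ("0b", "b"),
    "octal": ("0", "o"),
    "hexadecimal": ("0x", "x"),
    "decimal": ("", "d"),  # no header is needed for decimal
}


def render_constant(value: int, fmt: str, signed: bool = True) -> str:
    """Render an integer constant in the requested format (hypothetical helper).

    The conversion never changes signedness: a negative value is only
    accepted for a signed data-type, and its string starts with '-'.
    """
    if value < 0 and not signed:
        raise ValueError("an unsigned constant cannot be negative")
    header, spec = FORMATS[fmt]
    sign = "-" if value < 0 else ""
    return sign + header + format(abs(value), spec)


print(render_constant(32, "hexadecimal"))
print(render_constant(-16, "hexadecimal"))
print(render_constant(5, "binary"))
```

The sign is handled separately from the digits so that the same magnitude formatting serves both signed and unsigned constants.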
@@ -1307,15 +1322,15 @@ A register value in this context is a region of code in the Program where a specific register holds a known constant value. Ghidra maintains an explicit list of these values for the Program (see the documentation for Register Values), - which the decompiler can use when analyzing a function. - A register value benefits decompiler analysis, especially if the original compiler was aware - of the constant value, as the decompiler can recover address references calculated as offsets relative to the register + which the Decompiler can use when analyzing a function. + A register value benefits Decompiler analysis, especially if the original compiler was aware + of the constant value, as the Decompiler can recover address references calculated as offsets relative to the register and otherwise propagate the constant.
- A register value is set by highlighting the region of code in the Listing Window and then invoking the
- Set Register Values ... command
- from the pop-up menu. The beginning and end of a region is indicated in the Listing Window with
+ A register value is set by highlighting the region of code in a Listing window and then invoking the
+ Set Register Values... command
+ from the pop-up menu. The beginning and end of a region are indicated in a Listing window with
assume directives, and regions can be generally viewed from the
Register Manager window.
- There is a special class of registers, called context registers whose + There is a special class of registers called context registers whose values have a different effect on analysis and decompilation than described above.
|
- In general, the map between machine instructions and tokens is not one to one because the decompiler
+ In general, the map between machine instructions and tokens is not one-to-one because the Decompiler
transforms its underlying representation of the function.
An instruction may no longer have any operator that corresponds to it in the decompiled result.
Tokens may be transformed from the natural operation of the machine instruction they are associated
@@ -153,28 +153,27 @@
Pressing the
+ Multiple Snapshot windows can be brought up to show decompilation of different functions simultaneously. Snapshot windows are visually distinguished from the main Decompiler window by their colored outline. -- The Snapshot - window, unlike the main window, is not linked to the Listing window - and does not change the function it displays in response to external navigation events. - A Snapshot window can be used to hold a function fixed while the user navigates to - different functions in the Listing or other windows. - Navigating to new functions within a Snapshot window is possible when the window is active. The window responds to the actions
+ + Double-clicking on specific tokens within the Snapshot window may also cause it to navigate + to a new location (see Double-Click). +
@@ -193,7 +196,7 @@
@@ -225,9 +228,9 @@
Tool Bar
If the current location within the Code Browser is in disassembled code, but that code is not contained in a Formal Function Body, - then the decompiler window invents a function body on the fly called an + then the Decompiler invents a function body, on the fly, called an Undefined Function. The background color of the window is changed to gray to indicate this special state. @@ -202,21 +205,21 @@The entry point address of the Undefined Function is chosen by - backtracking through the code's control-flow from the current location to the start of + backtracking through the code's control flow from the current location to the start of a basic block that has no flow coming in except possibly from call instructions. During decompilation, a function body is computed from the selected entry point (as with any function) - based on control-flow up to instructions with terminator semantics. + based on control flow up to instructions with terminator semantics. - The current address, as indicated by the cursor in the Listing Window for instance, is - generally not the entry of the invented function, but the current address will be + The current address, as indicated by the cursor in the Listing for instance, is + generally not the entry point of the invented function, but the current address will be contained somewhere in the body.
For display purposes in the window, the invented function is given a name based on the
computed entry point address with the prefix - This is a group of actions that can be triggered by pressing a button in the tool/title - bar at the top of individual decompiler windows, both main and - Snapshot. The action applies to the function and decompiler results + The following actions are available by pressing the corresponding icon in the title/tool + bar at the top of each individual Decompiler window. + The action applies to the function and Decompiler results displayed in that particular window.
@@ -237,7 +240,7 @@
@@ -279,7 +282,7 @@
Exports the decompiled result of the current function to a file. A file chooser @@ -248,7 +251,7 @@ This action exports a single function at a time. The user can export all functions simultaneously from the Code Browser, by selecting the menu - File -> Export Program ... and then choosing + File -> Export Program... and then choosing C/C++ from the drop-down menu. See the full documentation for the Export dialog. @@ -262,13 +265,13 @@
Creates a new Snapshot window. The Snapshot window - initially displays the same function as the decompiler window on which the action was triggered, - but if that window navigates to other functions, the Snapshot does not - follow and continues to display the original function. (See “Snapshot Windows”) + displays the same function as the Decompiler window on which the action was triggered, + and if that window navigates to other functions, the Snapshot does not + follow but continues to display the original function (see Snapshot Windows).
Triggers a re-decompilation of the current function displayed in the window. @@ -293,8 +296,8 @@ |
| This action is not necessary for normal reverse engineering tasks. Re-decompilation is automatically triggered for all - decompiler windows by any change to the Program, so the most up-to-date decompilation is - always available to the user without this action. This action is a primarily a debugging + Decompiler windows by any change to the Program, so the most up-to-date decompilation is + always available to the user without this action. This action is primarily a debugging aid for plug-in developers. |
- - button
+ - button
- Copies the currently selected text in the decompiler window to the clipboard. + Copies the currently selected text in the Decompiler window to the clipboard.
- This action is located in the drop-down menu on the right side of the decompiler + This action is located in the drop-down menu on the right side of the Decompiler window tool/title bar.
@@ -327,7 +330,7 @@ the current function is collected and saved to an output file in XML format. A file chooser dialog is presented to the user to choose the output file. The file is useful when submitting bug reports - about the decompiler as it is generally much smaller than + about the Decompiler as it is generally much smaller than the entire Program and only contains information specific to the function. Information is generated by performing the full decompilation of the function and collecting all the data and @@ -342,18 +345,13 @@ Graph AST Control Flow
- Generate a control-flow graph based upon the results in the active Decompiler Window, + This action is located in the drop-down menu on the right side of the Decompiler + window tool/title bar. +
++ Generate a control-flow graph based upon the results in the active Decompiler window, and render it using the current Graph Service.
-![]() |
-- |
|---|---|
| - If no Graph Service is available then this action will not be present. - |
- Moves the decompiler window cursor and highlights the token. Within the + Moves the Decompiler window cursor and highlights the token. Within the main window, if a token has a machine address - associated with it, a left click generates a + associated with it, a left-click generates a navigation event to that address, which may cause other - windows to display code near that address. - (See “Cross Highlighting”) + windows to display code near that address + (see Cross-Highlighting).
Selecting a '(' or ')' token causes it and its matching parenthesis to be @@ -385,55 +383,60 @@
- Moves the decompiler window cursor, highlights the token, and brings up the menu of - context sensitive actions. Any highlighting and navigation is identical to a - left click. The menu actions presented depend primarily on the token type and + Moves the Decompiler window cursor, highlights the token, and brings up the menu of + context-sensitive actions. Any highlighting and navigation is identical to a + left-click. The menu actions presented depend primarily on the token type and are tailored to the context at that point in the code.
- Navigates based on the selected symbol or other token (See below). If the selected token represents a formal symbol, - such as a function name or a global variable, double clicking causes a + Navigates based on the selected symbol or other token (see below). + If the selected token represents a formal symbol, + such as a function name or a global variable, double-clicking causes a navigation event to the address associated with the symbol. +
++ This action is performed by clicking twice on the desired token with the left + mouse button.
- Double clicking a called function name causes the + Double-clicking a called function name causes the window itself to navigate away from its current function to the called function, triggering a new decompilation if necessary and changing its display.
- Double clicking a global - variable name does not have any effect on the decompiler window itself, - but other windows, like the Listing window, may navigate to the + Double-clicking a global + variable name does not have any effect on the Decompiler window itself, + but Listing or other windows may navigate to the storage address of the global variable.
- Double clicking a token representing a constant causes the constant to be treated - as an address, and a navigation event to that address is generated. The decompiler + Double-clicking a token representing a constant causes the constant to be treated + as an address, and a navigation event to that address is generated. The Decompiler window itself navigates depending again on whether the address represents a new function or not.
- Double clicking the label within a goto statement causes the window to navigate + Double-clicking the label within a goto statement causes the window to navigate to the target of the goto, within the function. The cursor is set and the window view is adjusted if necessary to ensure that the target is visible.
- Double clicking a '{' or '}' token, causes the window to navigate to the matching brace + Double-clicking a '{' or '}' token causes the window to navigate to the matching brace within the window. The cursor is set and the window view is adjusted if necessary to ensure that the matching brace is visible.
Opens a new Snapshot window, navigating it to the selected symbol. This is a convenience for immediately decompiling and displaying a called function in a new window, without disturbing the active window. The behavior is similar to the - Double Click action, the selected token must represent a function name symbol or possibly + Double-Click action: the selected token must represent a function name symbol or possibly a constant address, but the navigation occurs in the new Snapshot window.
++ This action is performed by clicking twice on the desired token with the left + mouse button, while holding down the Ctrl key. +
Generates a navigation event to the address, within the current function, associated with the clicked token. This allows Snapshot windows to do basic - cross-highlighting in the same way as the main decompiler window. - A Control double-click causes the Listing and other windows to navigate to and display the same - portion of code currently being displayed in the Snapshot window. (See “Cross Highlighting”) + cross-highlighting in the same way as the main Decompiler window. + A ctrl-shift-click causes Listing and other windows to navigate to and display the same + portion of code currently being displayed in the Snapshot window (see Cross-Highlighting). +
++ This action is performed by clicking on the desired token with the left mouse + button, while holding down both the Ctrl and Shift keys.
- Highlights every occurrence of a variable, constant, or operator under the current - cursor location, within the decompiler window. + Highlights every occurrence of a variable, constant, or operator represented by the selected + token, within the Decompiler window.
@@ -487,8 +498,8 @@All the actions described in this section can be activated from the menu that pops up - when right-clicking on a token within the decompiler window. The pop-up menu is context sensitive and - the type of token in particular (See “Display”) determines what actions are available. + when right-clicking on a token within the Decompiler window. The pop-up menu is context sensitive and + the type of token in particular (see Display) determines what actions are available. The token clicked provides a local context for the action and may be used to pinpoint the exact variable or operation affected.
@@ -504,7 +515,7 @@The structure definition is filled in by examining how the variable is used, assuming it is a pointer to the structure, tracing - data-flow to all the expressions the variable is used in. LOAD and STORE operations + data flow to all the expressions the variable is used in. LOAD and STORE operations trigger new fields and additive offsets are traced to calculate the offset of the fields within the structure definition.
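The field discovery described above can be illustrated with a small sketch. It is not the Decompiler's implementation: the `Access` record and the `infer_fields` and `layout` functions are hypothetical names, and the model assumes every LOAD or STORE through the pointer has already been reduced to a byte offset and an access size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One memory access through the candidate structure pointer (hypothetical model)."""
    kind: str    # "LOAD" or "STORE"
    offset: int  # additive offset from the base pointer, in bytes
    size: int    # number of bytes read or written


def infer_fields(accesses):
    """Return sorted (offset, size) pairs, keeping the widest access seen at each offset."""
    fields = {}
    for acc in accesses:
        fields[acc.offset] = max(fields.get(acc.offset, 0), acc.size)
    return sorted(fields.items())


def layout(fields):
    """Turn discovered fields into (offset, size, name) entries, padding any gaps."""
    result = []
    cursor = 0
    for offset, size in fields:
        if offset < cursor:
            # This sketch simply skips a field that overlaps the previous one.
            continue
        if offset > cursor:
            result.append((cursor, offset - cursor, "undefined"))
        result.append((offset, size, f"field_0x{offset:x}"))
        cursor = offset + size
    return result


accesses = [
    Access("LOAD", 0, 4),
    Access("STORE", 8, 8),
    Access("LOAD", 0, 4),
    Access("LOAD", 4, 2),
]
for entry in layout(infer_fields(accesses)):
    print(entry)
```

Each distinct offset yields one field, and the unused bytes between fields are left as undefined padding, which mirrors the idea that fields can be renamed or retyped later as more uses are examined.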
@@ -513,8 +524,8 @@ retyped to be a pointer to the structure. Within the window, the function is decompiled again and references to new fields in the structure should be immediately apparent. These can be renamed or retyped from the window - to further refine the new structure definition. - (See “Rename Variable”) + to further refine the new structure definition + (see Rename Variable).Set or change a comment at the address of the selected token.
- These actions bring up the general Comment dialog (See Comments), - which associates the comment with a specific address in the Program. For the - decompiler actions, this address is of the machine instruction most closely linked to the selected token. - Comments will be visible in the Listing and other Ghidra windows viewing the same + These actions bring up the general Comment dialog (see Comments), + which associates the comment with a specific address in the Program. For comment + actions in the Decompiler, this address is of the machine instruction most closely linked to the selected token. + Any comments generated from a Decompiler window will be visible in Listing and other windows viewing the same section of code.
- The decompiler windows can display all comment types, but this may be affected by the Display options - (See “Comments”). + A Decompiler window can display all comment types, but this may be affected by the Display options + (see Comments).
Brings up the dialog for setting or editing a Plate comment.
Brings up the dialog for setting or editing a Pre comment.
+ Brings up the dialog for setting or editing a comment based on the selected token. + A Plate comment is edited if the token is part + of the function's header. A Pre comment is edited otherwise. +
- Commit the names of any local variables discovered during the decompiler's analysis + Commit the names of any local variables discovered during the Decompiler's analysis to the Program database as new Variable Annotations. The recovered data-type is not committed as part of the annotation, only the name and storage location.
- Parameters are not affected by this command, see “Commit Params/Return”. + Parameters are not affected by this command (see Commit Params/Return). The purpose of the command is to synchronize the local variables in the - decompiler's view of a function with the formal Variable Annotations in the disassembly view, + Decompiler's view of a function with the formal Variable Annotations in the disassembly view, without otherwise affecting the decompilation. After executing this command, additional changes - to local variable can be performed directly on the corresponding annotations in the Listing Window, - using various methods (See “Variable Annotations”). + to local variables can be performed directly on the corresponding annotations displayed in Listing windows, + using various methods (see Variable Annotations). Data-types are not forced for new annotations; they are created with - an undefined data-type, which allows the decompiler to refine + an undefined data-type, which allows the Decompiler to refine its view of the variable's data-type as new information becomes available - (See “Forcing Data-types”). + (see Forcing Data-types).
@@ -595,28 +612,28 @@ Commit Params/Return- Commit the decompiler's analysis of the input parameters and return value of the current + Commit the Decompiler's analysis of the input parameters and return value of the current function as annotations to the Program database.
- In the absence of either imported or user defined - information about a function's prototype, the decompiler performs its own analysis of what + In the absence of either imported or user-defined + information about a function's prototype, the Decompiler performs its own analysis of what the prototype is, determining the storage location and data-type of all parameters and the return value. This action commits this analysis permanently for the current function displayed in the window, creating a matching Variable Annotation for each input - parameter and the return value. The new annotations will be displayed in the - Listing Window as part of the function header, and the action effectively - synchronizes the disassembly view and decompiler's view of the function prototype. + parameter and the return value. The new annotations will be displayed in a + Listing window as part of the function header, and the action effectively + synchronizes the disassembly view and Decompiler's view of the function prototype.
Committed prototype information is used both when decompiling the function itself and when - decompiling other functions that call it. - The committed annotations are forcing on the decompiler, and it will - no longer perform prototype recovery analysis for that function. The decompiler assumes the committed parameters, + decompiling other functions that call it. The committed annotations are forcing + on the Decompiler (see Forcing Data-types), and it will + no longer perform prototype recovery analysis for that function. The Decompiler assumes the committed parameters, and only the committed parameters, exist and will not modify their data-types, with the exception of parameters that are explicitly marked as having an undefined data-type. The user must manually modify individual variables or clear the entire prototype - if they want a change (See “Variable Annotations”). + if they want a change (see Variable Annotations).
@@ -665,10 +682,10 @@This command primarily targets a constant token in the Decompiler window, but if there is a scalar operand in an instruction that corresponds - with the selected constant, the same conversion is also applied to the scalar in the Listing + with the selected constant, the same conversion is also applied to the scalar in any Listing window. This is equivalent to selecting the - Convert command from the - Listing. There may not be a scalar operand directly corresponding to the selected constant, in + Convert command from a + Listing window. There may not be a scalar operand directly corresponding to the selected constant, in which case the conversion will be applied only in the Decompiler window.
@@ -679,20 +696,20 @@
The constant's encoding can be changed by selecting a different Convert command, or it can be returned to its default encoding by selecting - the “Remove Convert/Equate” command. + the Remove Convert/Equate command.
- Copy selected code from the decompiler window into the clipboard. + Copy selected code from the Decompiler window into the clipboard.
This is part of the standard copy - capabilities for all Ghidra windows and is suitable for copying (sections of) decompiler output + capabilities for all Ghidra windows and is suitable for copying (sections of) Decompiler output into other documents.
@@ -713,8 +730,8 @@ Enum Editor.- Any change to the definition of the data-type is automatically incorporated by the decompiler into its output - (see “Variable Data-types”). + Any change to the definition of the data-type is automatically incorporated by the Decompiler into its output + (see Variable Data-types).
@@ -727,7 +744,7 @@ details about how it passes parameters.- The action is available from any token in the decompiler window. Most tokens trigger editing + The action is available from any token in the Decompiler window. Most tokens trigger editing of the current function itself, but a called function can be edited by putting the cursor on its name specifically.
@@ -758,16 +775,16 @@See documentation for the Function Editor Dialog. - The decompiler automatically incorporates any changes into its output. + The Decompiler automatically incorporates any changes into its output.
- Search for strings within the active window, in the current decompiler output. + Search for strings within the active window, in the current Decompiler output.
The command brings up a dialog where a search pattern can be entered as a raw string or regular expression. @@ -783,8 +800,8 @@ Highlight
- All these actions highlight a specific set of variable tokens tracing the data-flow - of the selected variable within the current function. Data-flow is the directed flow + All these actions highlight a specific set of variable tokens tracing the data flow + of the selected variable within the current function, defined as the directed flow of data from input variables through operations that manipulate their value to their output variables. The operations and variables chain together to form data-flow paths. @@ -863,7 +880,7 @@ Secondary Highlight
- A secondary highlight is a semi-permanent token highlight in the decompiler + A secondary highlight is a semi-permanent token highlight in the Decompiler window that, unlike normal highlights, will not go away as the user clicks other tokens. The color and text being highlighted are controlled by the user and will persist for the duration of the Ghidra session or until the user @@ -907,31 +924,33 @@ Override the function prototype corresponding to the function under the cursor.
- This action can be triggered at call sites, where the function + This action can be triggered at a call site, where the function being decompiled is calling into another function. Users must select either the token representing - the called function's name or the tokens representing the function pointer at the call site. - A dialog is brought up where the a complete function declaration, specifying - the return data-type along with the name and data-type for each input parameter. Additionally, - the "Calling Convention", "In Line", and "No Return" properties of the function prototype - can be set (See “Function Prototypes”). + the called function's name or one of the tokens representing the function pointer at the call site. + The action brings up a dialog where the function prototype + corresponding to the call site can be edited. The dialog provides fine-grained control of + the return data-type along with the name and data-type of each input parameter. + The function prototype properties Calling Convention, + In Line, and No Return + can also be set (see Function Prototypes).
- Confirming the dialog forces the new function prototype on the decompiler's view of the called function, + Confirming the dialog forces the new function prototype on the Decompiler's view of the called function, but only for the single selected call site.
This action is suitable for either indirect calls or direct calls to functions taking a variable number of arguments; situations where a complete description of all parameters is not available. For direct calls with a fixed number of arguments, it is almost always better to provide - parameter information by setting the function's prototype directly. See the - “Commit Params/Return” command for instance. In this situation, the "Override Signature" + parameter information by setting the function's prototype directly (see the + Commit Params/Return command). In this situation, the "Override Signature" command is still possible, but it will bring up a confirmation dialog.
- The selected constant must have had either a “Convert” or a - “Set Equate ...” command applied to it. After applying this command, + The selected constant must have had either a Convert or a + Set Equate... command applied to it. After applying this command, the conversion is no longer applied, and the selected constant will be displayed using the decompiler's default strategy, which depends on the data-type of the constant and other display settings (See Integer format). @@ -1010,10 +1029,10 @@
This action can only be triggered at call sites, where an overriding - prototype was previously placed by the “Override Signature” command. As with + prototype was previously placed by the Override Signature command. As with this command, users must select either the token representing the called function's name or the tokens representing the function pointer at the call site. The action causes the - override to be removed immediately. Parameter information will be drawn from the decompiler's + override to be removed immediately. Parameter information will be drawn from the Decompiler's normal analysis.
The current function can be renamed by selecting the name token within the function's - declaration at the top of the decompiler window, or individual called functions + declaration at the top of the Decompiler window, or individual called functions can be renamed by selecting their name token within a call expression. This action brings up a dialog containing a text field prepopulated with the name to be changed. The current namespace (and any parent namespaces) is @@ -1037,8 +1056,8 @@
A new or child namespace can - be specified by prepending the base name with the namespace using the C++ '::' - separator characters. Any namespace path entered this way is considered relative + be specified by prepending the base name with the namespace using the C++ "::" + delimiter characters. Any namespace path entered this way is considered relative to the namespace set in the drop-down menu, so the Global namespace may need to be selected if the user wants to specify an absolute path. If any path element of the namespace does not exist, it is created. @@ -1049,26 +1068,6 @@
- Rename the label corresponding to the token under the cursor. -
-- A label can be renamed by triggering this action while the corresponding label token is - under the cursor. This action brings up the - Edit Label Dialog. -
-- The change will be immediately visible across all references to the label - (including in any Decompiler, Listing, and Functions windows). -
-
If the initial name looks like field_0x.., it may be that the field offset was
- discovered by the decompiler, and the field does not exist in the structure definition.
- In this case, a new field is created at that offset, with the new name and a data-type of "undefined".
+ discovered by the Decompiler, and the field does not exist in the structure definition.
+ In this case, a new field is created at that offset, with the new name and a data-type of
+ undefined.
The change to the definition is visible globally @@ -1093,10 +1093,10 @@ is triggered again to incorporate the new name, but the output is otherwise unaffected.
- Within a decompiler window, field name tokens are presented in context, + Within a Decompiler window, field name tokens are presented in context, showing how they are used within the code flow of the current function. - Combined with “Auto Create Structure” and - “Retype Field”, this action allows a + Combined with Auto Create Structure and + Retype Field, this action allows a structure to be created and filled in based on this context.
A new or child namespace can - be specified by prepending the base name with the namespace using the C++ '::' - separator characters. Any namespace path entered this way is considered relative + be specified by prepending the base name with the namespace using the C++ "::" + delimiter characters. Any namespace path entered this way is considered relative to the namespace set in the drop-down menu, so the Global namespace may need to be selected if the user wants to specify an absolute path. If any path element of the namespace does not exist, it is created.
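As a rough illustration of the relative-path rule above, the following sketch splits an entered name on "::" and resolves it against the namespace chosen in the drop-down menu, creating any missing path elements. The `resolve_name` function and the nested-dictionary namespace tree are hypothetical stand-ins, not Ghidra's API.

```python
def resolve_name(root, base_path, text):
    """Resolve text such as 'Inner::counter' relative to base_path.

    root is a nested dict standing in for the namespace tree, and base_path
    is the list of namespace names selected in the drop-down menu (an empty
    list stands for the Global namespace). Returns (namespace_path, base_name).
    """
    *spaces, base_name = text.split("::")
    namespace_path = list(base_path) + spaces
    node = root
    for element in namespace_path:
        # Create the path element if it does not exist yet.
        node = node.setdefault(element, {})
    return namespace_path, base_name


tree = {"Outer": {}}
print(resolve_name(tree, ["Outer"], "Inner::counter"))
print(tree)
```

Passing an empty `base_path` corresponds to selecting the Global namespace, which is how an absolute path would be expressed in this model.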
The change will be immediately visible across all references to the variable, - including the Decompiler and Listing windows. A new decompilation is triggered + including in Decompiler and Listing windows. A new decompilation is triggered to incorporate the new name, but the output is otherwise unaffected.
+ Rename the label corresponding to the token under the cursor. +
++ A label can be renamed by triggering this action while the corresponding label token is + under the cursor. This action brings up the + Edit Label Dialog. +
++ The change will be immediately visible across all references to the label + (including in any Decompiler, Listing, and Functions windows). +
+- Local variables and parameters presented by the decompiler may be invented on-the-fly + Local variables and parameters presented by the Decompiler may be invented on-the-fly and don't necessarily have a formal annotation in Ghidra - (see “Variable Annotations”). Performing this action on + (see Variable Annotations). Performing this action on a variable will create an annotation if one didn't exist previously, which will - generally be visible as part of the function header in the Listing window. + generally be visible as part of the function header in any Listing window. A new annotation will not commit the data-type of the variable, and data-types applied later, and elsewhere in the function, can still propagate into the variable. @@ -1183,7 +1201,7 @@
The change to the definition is visible globally throughout the Program, anywhere the data-type is referenced, and is forcing - on the decompiler (see “Forcing Data-types”). Decompilation is triggered + on the Decompiler (see Forcing Data-types). Decompilation is triggered again, and the new data-type is propagated from the point of the field reference(s). Changes to the output may be large and indirect.
@@ -1204,8 +1222,8 @@The change is visible globally throughout the Program, anywhere the variable is - referenced, and is forcing on the decompiler - (see “Forcing Data-types”). Decompilation is triggered + referenced, and is forcing on the Decompiler + (see Forcing Data-types). Decompilation is triggered again, and the new data-type is propagated from the variable reference(s). Changes to the output may be large and indirect.
@@ -1220,22 +1238,22 @@
This action is only available from the data-type token in the function declaration, at the
- top of the decompiler's output. It brings up a dialog prepopulated with the current
+ top of the Decompiler's output. It brings up a dialog prepopulated with the current
data-type returned by the function. The user can select any fixed length data-type in the Program.
Editing and confirming this dialog immediately changes the data-type. If an annotation for
the return value (named <RETURN>) did not exist previously, one is created.
As input parameter annotations and the return value annotation must be committed as a whole - (see the discussion of function prototype's in “Forcing Data-types”), if + (see the discussion of function prototypes in Forcing Data-types), if no prototype existed previously, this action also causes variable annotations for all input parameters to be created as well. In this situation, the action is equivalent to - “Commit Params/Return”, and a confirmation dialog comes up to notify the user. + Commit Params/Return, and a confirmation dialog comes up to notify the user.
Setting a data-type on the return value using this action affects decompilation for the function itself and, additionally, any function that calls this function. Within a calling - function, the decompiler propagates the data-type into the variable or expression incorporating + function, the Decompiler propagates the data-type into the variable or expression incorporating the return value at each call site.
- The change to the data-type is forcing on the decompiler - (see “Forcing Data-types”). Decompilation is triggered again, and the new + The change to the data-type is forcing on the Decompiler + (see Forcing Data-types). Decompilation is triggered again, and the new data-type is propagated from the variable reference(s). Changes to the output may be large and indirect.
- Local variables and parameters presented by the decompiler may be invented on-the-fly + Local variables and parameters presented by the Decompiler may be invented on-the-fly and don't necessarily have a formal annotation in Ghidra - (see “Variable Annotations”). Performing this action on a variable + (see Variable Annotations). Performing this action on a variable will create an annotation if one didn't exist previously, which will generally be - visible as part of the function header in the Listing window. + visible as part of the function header in any Listing window.
Performing this action on a function parameter causes a formal annotation to @@ -1282,7 +1300,7 @@
Change the display of the integer or character constant under the cursor to an equate @@ -1300,17 +1318,17 @@ OK, completes the action, and the selected equate is substituted for its constant.
- This command primarily targets a constant token in the Decompiler window, but + This command primarily targets a constant token in a Decompiler window, but if there is a scalar operand in an instruction that corresponds - with the selected constant, the same equate is also applied to the scalar in the Listing + with the selected constant, the same equate is also applied to the scalar in any Listing window. This is equivalent to selecting the - Set Equate command from the - Listing. There may not be a scalar operand directly corresponding to the selected constant, in + Set Equate command from a + Listing window. There may not be a scalar operand directly corresponding to the selected constant, in which case the equate will be applied only in the Decompiler window.
Once an equate is applied, the constant can be returned to its default display - by selecting the “Remove Convert/Equate” command. + by selecting the Remove Convert/Equate command.
@@ -1323,11 +1341,11 @@ possible range.- The decompiler defines high-level variables in terms of varnodes that + The Decompiler defines high-level variables in terms of varnodes that are merged together to produce the final variable. Some merging is speculative, which reduces the number of variables overall, but is not strictly necessary for valid decompilation. The merged - variable can be represented with two or more variables that have a smaller range. See the - documentation on “HighVariable”. + variable can be represented with two or more variables that have a smaller range (see the + documentation on HighVariable).
This command is only available if the selected token is part of a high-level variable that has diff --git a/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/images/document-properties.png b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/images/document-properties.png new file mode 100644 index 0000000000..ab0e8ea377 Binary files /dev/null and b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/images/document-properties.png differ diff --git a/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/images/openFolder.png b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/images/openFolder.png new file mode 100644 index 0000000000..14cf972f07 Binary files /dev/null and b/Ghidra/Features/Decompiler/src/main/help/help/topics/DecompilePlugin/images/openFolder.png differ