Use of UNIX 'sed' Command in Anonymization
The UNIX 'sed' command is used to process the log files. The regular expression in the configuration file is passed to sed, which identifies the string that needs to be replaced or removed. If the string is not known in advance, for example, the login name of a user, you search for the identifying string User Name: and then extract the name that follows it as the string to be replaced.
This involves two steps:
- Identifying the complete string, including the identifying context. This is achieved by using the Regular expression field.
- Extracting just the substring that needs to be replaced or removed. This is achieved by using the Grp field.
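The two steps can be sketched with sed directly. This is a minimal illustration rather than the tool's actual implementation: the sample log lines are invented for the example, and the `\s` shorthand assumes GNU sed.

```shell
# Hypothetical three-line log excerpt; only one line carries the context.
log='\# Host Name: build01
\# User Name: samanthak
\# Program Start'

# Step 1: select the complete string, including its identifying context.
matched=$(printf '%s\n' "$log" | sed -n '/^.*User Name:\s*.*$/p')

# Step 2: extract only the substring inside the first escaped group.
name=$(printf '%s\n' "$matched" | sed -n 's/^.*User Name:\s*\(.*\)$/\1/p')

printf 'matched line: %s\n' "$matched"
printf 'extracted:    %s\n' "$name"
```

The `-n` option together with the `p` flag makes sed print a line only when the pattern matches, so lines without the identifying context produce no output.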
Example 1
Consider the following line of the configuration file, which anonymizes the username.
UserName, ^.*User Name:\s*\(.*\)$, anonymize, 1
Here, the Regular expression is ^.*User Name:\s*\(.*\)$ and Grp is 1.
- ^: Matches the beginning of the line in the log file.
- .*: Matches zero or more arbitrary characters before the string User Name:, that is, the string \#.
- User Name: Matches the string User Name:.
- \s*: Matches the whitespace between User Name: and the name of the user, for example, samanthak.
- \(.*\): The expression .* matches an arbitrary sequence of characters, such as samanthak. Since this value is referenced later, it requires escaped parentheses. Each pair of parentheses is assigned an increasing number. In this example, only one pair is specified, so it is assigned the value 1.
- $: Matches the end of the line.
The Grp is set to 1 to indicate that this is the value that needs to be replaced.
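As a rough illustration of what replacing group 1 could look like, the sketch below extracts the name and then substitutes a placeholder for every occurrence of it. The placeholder USER_1 and the second sample line are assumptions made for this example; the text above does not say what the replacement value looks like. The sketch also assumes GNU sed and a name that contains no regular expression metacharacters.

```shell
# Hypothetical log excerpt: the login name also appears in a later line.
log='\# User Name: samanthak
\i cd /home/samanthak/work'

# Group 1 of the rule's regular expression holds the login name.
name=$(printf '%s\n' "$log" | sed -n 's/^.*User Name:\s*\(.*\)$/\1/p')

# Replace every occurrence of that name with a stand-in token.
# USER_1 is a made-up placeholder, not the tool's real output.
anonymized=$(printf '%s\n' "$log" | sed "s/$name/USER_1/g")

printf '%s\n' "$anonymized"
```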
Example 2
In this example, the same regular expression is used in multiple rules to identify and anonymize schematic library and cell names. A replay command of the general form function("libName" "cellName" "schematic") can be used to identify the library and cell names. For example, the replay command for opening a cellview from the history list is:
\i cv = dbOpenCellViewByType("opamp090" "full_diff_opamp_AC" "schematic")
Now, create a regular expression to identify the argument list. So, for the following argument list:
"opamp090" "full_diff_opamp_AC" "schematic"
The regular expression is as follows:
^.*"\([^"]*\)"\s\+"\([^"]*\)"\s\+"schematic".*$
In this expression, the first two values in the list are used to identify the library and cell names.
The components of the regular expression are as follows:
- ^.*": Matches everything from the beginning of the line up to the first quote.
- \([^"]*\): Matches the string inside the quotes, which can be referenced as Grp 1.
- \s\+: Matches one or more whitespace characters.
- ": Matches a literal quote: the closing quote of the first string and the opening and closing quotes of the second string.
- \([^"]*\): Matches the string inside the second set of quotes, which can be referenced as Grp 2.
- "schematic": Matches the literal string "schematic".
- .*$: Matches the remainder of the line.
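To see the components working together, the expression can be tried against the replay command with sed. This is a sketch that assumes GNU sed (for `\s` and `\+`); the second input line, with a "layout" view, is invented to show that the literal "schematic" acts as identifying context.

```shell
re='^.*"\([^"]*\)"\s\+"\([^"]*\)"\s\+"schematic".*$'

cmd='\i cv = dbOpenCellViewByType("opamp090" "full_diff_opamp_AC" "schematic")'
# Hypothetical line with a different view name; the expression should skip it.
other='\i cv = dbOpenCellViewByType("opamp090" "full_diff_opamp_AC" "layout")'

# Print group 1 and group 2 separately for the matching line.
lib=$(printf '%s\n' "$cmd" | sed -n "s/$re/\\1/p")
cell=$(printf '%s\n' "$cmd" | sed -n "s/$re/\\2/p")

# A line without "schematic" does not match, so nothing is printed.
skipped=$(printf '%s\n' "$other" | sed -n "s/$re/\\1/p")

printf 'Grp 1: %s\n' "$lib"
printf 'Grp 2: %s\n' "$cell"
printf 'non-matching line yields: "%s"\n' "$skipped"
```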
We can use this expression to create the following rules. These rules use the same regular expression, but use a different Grp to identify the value to anonymize:
- Anonymizes the library name.
  ^.*"\([^"]*\)"\s\+"\([^"]*\)"\s\+"schematic".*$, anonymize, 1
  This rule extracts and anonymizes the library name opamp090.
- Anonymizes the cell name.
  ^.*"\([^"]*\)"\s\+"\([^"]*\)"\s\+"schematic".*$, anonymize, 2
  This rule extracts and anonymizes the cell name full_diff_opamp_AC.
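The way the Grp value selects between the two names can be sketched with a small helper that takes a rule line and a log line. The helper `apply_rule` is hypothetical, and its field splitting on ", " is only a guess based on the rule layout shown above, not the tool's real parser. It assumes GNU sed and a regular expression that contains neither ", " nor a slash.

```shell
# Hypothetical helper: apply one "regex, action, grp" rule to a log line
# and print the substring selected by the rule's Grp field.
apply_rule() {
    rule=$1
    input=$2
    grp=${rule##*, }      # last field: the group number
    rest=${rule%, *}      # drop the group number
    regex=${rest%, *}     # drop the action, leaving the regular expression
    printf '%s\n' "$input" | sed -n "s/$regex/\\$grp/p"
}

rule_lib='^.*"\([^"]*\)"\s\+"\([^"]*\)"\s\+"schematic".*$, anonymize, 1'
rule_cell='^.*"\([^"]*\)"\s\+"\([^"]*\)"\s\+"schematic".*$, anonymize, 2'
cmd='\i cv = dbOpenCellViewByType("opamp090" "full_diff_opamp_AC" "schematic")'

apply_rule "$rule_lib" "$cmd"    # selects the library name
apply_rule "$rule_cell" "$cmd"   # selects the cell name
```

Both rules share one expression; only the trailing number changes which escaped group is reported.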
Conditions for Anonymization
Anonymization works only when certain conditions are met. If anonymization fails, diagnostic logs are not submitted automatically.
Ensure that:
- The /bin/awk, /bin/sed, /usr/bin/id, and /usr/bin/openssl files exist and are executable. These files are needed to perform anonymization.
- The CDS.log or vailsAnonymizeLog.cfg file exists and is readable.
- There are no syntax issues in the vailsAnonymizeLog.cfg file. This means that:
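
The file-related conditions in this list can be checked by hand before running anonymization. The following sketch uses hypothetical helper names (`check_tool`, `check_readable`) and assumes that CDS.log and vailsAnonymizeLog.cfg are in the current directory, which may not match your setup. It does not check the configuration file for syntax issues.

```shell
# Hypothetical helper: succeed if the path is a regular, executable file.
check_tool() {
    if [ -f "$1" ] && [ -x "$1" ]; then
        printf 'ok: %s\n' "$1"
    else
        printf 'missing or not executable: %s\n' "$1"
        return 1
    fi
}

# Hypothetical helper: succeed if the path is a regular, readable file.
check_readable() {
    if [ -f "$1" ] && [ -r "$1" ]; then
        printf 'ok: %s\n' "$1"
    else
        printf 'missing or not readable: %s\n' "$1"
        return 1
    fi
}

status=0
for tool in /bin/awk /bin/sed /usr/bin/id /usr/bin/openssl; do
    check_tool "$tool" || status=1
done
# File locations are assumptions; adjust the paths as needed.
for file in CDS.log vailsAnonymizeLog.cfg; do
    check_readable "$file" || status=1
done
printf 'overall status: %s\n' "$status"
```

A status of 0 means every check passed; 1 means at least one file was missing or had the wrong permissions.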
Related Topics
Use of UNIX 'sed' Command in Anonymization