I have recently been studying CodeQL. There are plenty of introductions to CodeQL on the internet, so I will not repeat one here. This series is my personal set of CodeQL learning notes, revised and organized from my own knowledge base and shared so that we can learn together. Personally, I find QL syntax fairly counterintuitive; compared with today's mainstream OOP languages, it takes some effort to get used to. Unlike most CodeQL tutorials currently online, this series is based on the official documentation and on scenario examples, and it includes a good deal of my own understanding, thinking, and extension. It goes straight to the point with almost no filler, draws a conclusion from each example, and then verifies that conclusion against the example. I hope it offers a different perspective and way of thinking. Inevitably there will be some errors, and I hope readers will point them out in the comments section.
Let's first look at some basic concepts and structures.

// Basic structure
from /* variable declarations */
where /* logical formulas */
select /* expressions */
from int x, int y
where x = 3 and y = 4
select x, y
// Find Pythagorean triples within 1-10
from int x, int y, int z
where x in [1 .. 10] and y in [1 .. 10] and z in [1 .. 10] and
x * x + y * y = z * z
select x, y, z
// Or write it with a class, which gives encapsulation and method reuse
class SmallInt extends int {
  SmallInt() {
    this in [1 .. 10]
  }
  int square() {
    result = this * this
  }
}
from SmallInt x, SmallInt y, SmallInt z
where x.square() + y.square() = z.square()
select x, y, z
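The queries above list every triple twice, once as (x, y, z) and once as (y, x, z). As a small variation, and assuming we only want each triple once, the ranges can depend on the variables already declared. This is only a sketch of the idea, not something taken from the official tutorial:

```ql
// Each range starts at the previous variable, so x <= y <= z
from int x, int y, int z
where
  x in [1 .. 10] and
  y in [x .. 10] and
  z in [y .. 10] and
  x * x + y * y = z * z
select x, y, z
// Results: (3, 4, 5) and (6, 8, 10)
```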
Logical connectives, quantifiers, and aggregates
from Person p
where p.getAge() = max(int i | exists(Person t | t.getAge() = i ) | i) // General aggregation syntax, somewhat verbose
select p
// Or use the following ordered aggregation method
select max(Person p | | p order by p.getAge())
exists(<variable declarations> | <formula>)
<aggregate>(<variable declarations> | <formula restricting the values to consider> | <expression to aggregate over those values>)
e.g.
exists(Person p | p.getName() = "test")
Holds if there is a person named test.
max(int i | exists(Person p | p.getAge() = i) | i)
The first part declares the variable i. The second part restricts i to the values that are the age of some person. The third part is the expression that is aggregated for each such i. So i ranges over the set of all people's ages, and the aggregate evaluates to the largest of them, that is, max(i).
select max(Person p | | p order by p.getAge())
Consider every person and take the one with the maximum age. The order by p.getAge() clause tells max() to compare people by getAge(); it does not sort all of the objects.
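To try these forms without the tutorial database, here is a minimal sketch that uses only built-in int and string values. The variable names largest and shortest are my own and are purely illustrative:

```ql
from int largest, string shortest
where
  // exists: holds because 7 * 7 = 49
  exists(int x | x in [1 .. 10] and x * x = 49) and
  // General form: declare x | restrict x | aggregate the expression x
  largest = max(int x | x in [1 .. 10] and x % 3 = 0 | x) and
  // Ordered form: pick the string s itself, compared by its length
  shortest = min(string s | s = ["east", "south", "north"] | s order by s.length())
select largest, shortest
// Result: 9, "east"
```

The general form returns the aggregated expression, while the ordered form returns the element itself and uses order by only to decide which element wins.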
// More aggregation exercises
select min(Person p | p.getLocation() = "east" | p order by p.getHeight()) // The shortest person in the east of the village
select count(Person p | p.getLocation() = "south" | p) // Number of people in the south of the village
select avg(Person p | | p.getHeight()) // Average height of villagers
select sum(Person p | p.getHairColor() = "brown" | p.getAge()) // Sum of the ages of all villagers with brown hair
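The same aggregates can be checked by hand on plain integers. This is only a sketch on the numbers 1 to 10, not on the village data:

```ql
select
  count(int i | i in [1 .. 10] and i % 2 = 0 | i), // 5 even numbers
  sum(int i | i in [1 .. 10] | i), // 55
  avg(int i | i in [1 .. 10] | i) // 5.5
```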
// Comprehensive practice, https://codeql.github.com/docs/writing-codeql-queries/find-the-thief/#the-real-investigation
import tutorial
from Person p
where
p.getHeight() > 150 and // Height is over 150
not p.getHairColor() = "blond" and // Hair color is not blond
exists(string c | p.getHairColor() = c) and // Not bald: the person has some hair color, without saying which one
not p.getAge() < 30 and // Age is 30 or over. It can also be p.getAge() >= 30
p.getLocation() = "east" and // Lives in the east
( p.getHairColor() = "black" or p.getHairColor() = "brown" ) and // Hair is black or brown
not (p.getHeight() > 180 and p.getHeight() < 190) and // Not (over 180 and under 190)
exists(Person t | t.getAge() > p.getAge()) and // Not the oldest person. This uses existential syntax to indicate that there is someone older than him
exists(Person t | t.getHeight() > p.getHeight()) and // Not the tallest person
p.getHeight() < avg(Person t | | t.getHeight()) and // Shorter than the average height. Applies to all people without any restrictions
p = max(Person t | t.getLocation() = "east" | t order by t.getAge()) // The oldest person in the east. This line follows the official reference answer. The official documentation notes: "Note that if there are several people with the same maximum age, the query lists all of them." So if two people in the east share the maximum age, both of them satisfy this condition.
// p.getAge() = max(Person t | t.getLocation() = "east" | t.getAge()) // An alternative that compares the ages directly. Based on my own understanding and an answer from ChatGPT, I would use this form
select p
Predicates and classes
A predicate in CodeQL can be roughly understood as a function in other high-level programming languages: it also has parameters, return values, and reusability.
Let's take a simple example
import tutorial
predicate isSouthern(Person p) {
p.getLocation() = "south"
}
from Person p
where isSouthern(p)
select p
Here, the predicate is a logical condition that either holds or does not hold, which is somewhat like returning a boolean. QL does have a separate boolean type, and the two are not the same thing, but the analogy is good enough for understanding, so we will not expand on it here.
A predicate is defined much like a function. For a predicate with a result, the keyword predicate is replaced by the result type, for example int getAge() { result = xxx }
A predicate name must start with a lowercase letter.
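The two kinds of predicate can be put side by side on plain integers. The names isSmall and getSuccessor are only illustrative and are not part of the tutorial library:

```ql
// A predicate without result: it holds for 1..9 and for nothing else
predicate isSmall(int i) { i in [1 .. 9] }

// A predicate with result: the keyword `predicate` is replaced by the result type
int getSuccessor(int i) {
  isSmall(i) and
  result = i + 1
}

from int x
where isSmall(x)
select x, getSuccessor(x)
```

Note that getSuccessor does not compute a value on demand in the way a function does. It describes a relation between i and result, which is why it can simply have no result for an i outside 1..9.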
In addition, a new class can be defined directly that contains exactly the people for whom isSouthern holds:
class Southerner extends Person {
Southerner() { isSouthern(this) }
}
from Southerner s
select s
This is similar to a class definition in object-oriented languages (OOL), with inheritance, encapsulation, methods, and so on. Here Southerner() looks like a constructor, but unlike a constructor in an OOL class, it is a logical property and does not create any objects. What an OOL class calls methods are called member predicates in QL. The expression isSouthern(this) defines the logical property of this class and is known as the characteristic predicate. It uses the variable this, which is understood the same way as in OOL: if the property isSouthern(this) holds, then a Person (this) is a Southerner.
Put simply, the characteristic predicate of a subclass in QL answers two questions: which values of the superclass belong to this subclass, and what characteristics do I, as a subclass, have on top of the superclass.
Quotation from the official document: In QL, a class represents a logical attribute: when a value satisfies this attribute, it becomes a member of this class. This means that a value can belong to many classes — belonging to a specific class does not prevent it from belonging to other classes.
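To see that one value really can belong to several classes at once, here is a minimal sketch on integers. The classes Small and Even are made up for illustration:

```ql
class Small extends int {
  Small() { this in [1 .. 5] }
}

class Even extends int {
  Even() { this in [1 .. 10] and this % 2 = 0 }
}

// 2 and 4 satisfy both characteristic predicates, so they are members of both classes
from Small s
where s instanceof Even
select s
// Results: 2 and 4
```

Neither class extends the other. Membership is decided purely by whether the value satisfies each characteristic predicate.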
Let's look at the following example
class Child extends Person {
Child(){
this.getAge() < 10
}
override predicate isAllowedIn(string region) {
region = this.getLocation()
}
}
// The implementation of isAllowedIn in the superclass Person is as follows:
predicate isAllowedIn(string region) { region = ["north", "south", "east", "west"] }
// isAllowedIn(region) in the superclass Person holds for every region, while the override in Child holds only for the region the child lives in (getLocation())
Let's look at a complete example
import tutorial
predicate isSoutherner(Person p) {
p.getLocation() = "south"
}
class Southerner extends Person {
Southerner(){isSoutherner(this)}
}
class Child extends Person {
Child(){this.getAge() < 10}
override predicate isAllowedIn(string region) {
region = this.getLocation()
}
}
from Southerner s
where s.isAllowedIn("north")
select s, s.getAge()
There is a very important concept here that must be distinguished from classes in OOL. In OOL, a method overridden in one subclass does not affect the other subclasses, and there is no need to consider whether subclasses intersect. In QL, however, as the official documentation quoted above puts it, a class represents a logical property, and a value that satisfies the property is a member of the class. A value can therefore belong to many classes at once.
In QL, a value is a member of every subclass whose characteristic predicate it satisfies.
For the code above, if some Person satisfies the characteristic predicates of both Southerner and Child, then that person belongs to both classes and naturally inherits the member predicates of both.
My personal understanding is that a subclass in QL takes all the elements of the superclass, selects some of them with its characteristic predicate, and then overrides the member predicates for those elements. In effect, it changes how those elements of the superclass behave. The three examples below illustrate this.
// Take the Southerners from all Persons, then keep those who are allowed in the north. Children may only stay in their own region, so any Child among the Southerners (who lives in the south) is not allowed in the north and is filtered out
from Southerner s
where s.isAllowedIn("north")
select s
// Take all Children. They may only stay in their own region, so finding those allowed in the north means finding those who already live in the north
from Child c
where c.isAllowedIn("north")
select c
// Take all Persons and find those allowed in the north: all adults (by default, everyone is allowed in every region) plus the Children who already live in the north
from Person p
where p.isAllowedIn("north")
select p
More generally, if several subclasses override the same member predicate, the following rules apply (assume three classes A, B, and C; a summary follows later):
Assume A is the superclass, so its member predicate test() is the original definition rather than an override, and B and C both extend A and both override A's test().
If the type declared in from is A, then test() is overridden by both B and C. For values in the overlap of B and C there is no conflict: both overrides coexist.
If the type declared in from is B or C, then the result is based on B/C, plus the other class's override for the overlapping values that satisfy B/C, again coexisting without conflict.
If A is the superclass, B extends A, and C extends B, then C overrides the same member predicate in B instead of coexisting with it.
With multiple inheritance, C extends both A and B. If A and B both define the same member predicate, then C must override it.
For example:
class OneTwoThree extends int {
  OneTwoThree() { // Characteristic predicate
    this = 1 or this = 2 or this = 3
  }
  string getAString() { // Member predicate
    result = "One, two or three: " + this.toString()
  }
}
class OneTwo extends OneTwoThree {
  OneTwo() {
    this = 1 or this = 2
  }
  override string getAString() {
    result = "One or two: " + this.toString()
  }
}
from OneTwoThree o
select o, o.getAString()
/* result:
o | getAString() result
1 | One or two: 1
2 | One or two: 2
3 | One, two or three: 3
Understanding: OneTwoThree defines 1, 2, and 3, and OneTwo overrides the member predicate for 1 and 2. So OneTwoThree o has three values: 1 and 2 use the member predicate of OneTwo, and 3 uses that of OneTwoThree.
*/
Scenario 1: On this basis, add another class (important), A->B, A->C
class TwoThree extends OneTwoThree {
  TwoThree() {
    this = 2 or this = 3
  }
  override string getAString() {
    result = "Two or three: " + this.toString()
  }
}
/*
command: from OneTwoThree o select o, o.getAString()
result:
o | getAString() result
1 | One or two: 1
2 | One or two: 2
2 | Two or three: 2
3 | Two or three: 3
Understanding: TwoThree and OneTwo overlap at 2, but unlike in OOL, QL does not treat this as a conflict; both overrides coexist.
---
command: from OneTwo o select o, o.getAString()
result:
o | getAString() result
1 | One or two: 1
2 | One or two: 2
2 | Two or three: 2
Understanding: Both TwoThree and OneTwo override the member predicate for 2. QL does not treat this as a conflict, so they coexist. Since the type of o is OneTwo, the 'foundation' is 1 and 2, plus the override of 2 from TwoThree.
---
command: from TwoThree o select o, o.getAString()
result:
o | getAString() result
2 | One or two: 2
2 | Two or three: 2
3 | Two or three: 3
Understanding: Both TwoThree and OneTwo override the member predicate for 2. QL does not treat this as a conflict, so they coexist. Since the type of o is TwoThree, the 'foundation' is 2 and 3, plus the override of 2 from OneTwo.
*/
Scenario 2: A->B->C (inheritance chain)
class Two extends TwoThree {
  Two() {
    this = 2
  }
  override string getAString() {
    result = "Two: " + this.toString()
  }
}
from TwoThree o
select o, o.getAString()
/* result:
o | getAString() result
2 | One or two: 2
2 | Two: 2
3 | Two or three: 3
Understanding: Building on the examples above, Two overrides the member predicate of TwoThree, so for the value 2 it replaces TwoThree's result rather than coexisting with it.
*/
from OneTwo o
select o, o.getAString()
/* result:
o | getAString() result
1 | One or two: 1
2 | One or two: 2
2 | Two: 2
Understanding: Building on the previous examples, OneTwo and TwoThree coexist, but Two overrides part of TwoThree (that is, Two and TwoThree do not coexist).
*/
Interim summary: after working through the examples above, the conclusion is actually very simple. The essential point is to be clear about the inheritance chain. If two classes extend the same parent class, their results coexist; if one class is a descendant of the other (parent and child), the subclass overrides the corresponding part of the parent class.
In the example above, OneTwo and TwoThree are siblings that both extend OneTwoThree, so their results coexist without conflict. TwoThree and Two are parent and child, so, following the principle that the most specific class wins, Two overrides the part of TwoThree that corresponds to Two (Two also indirectly extends OneTwoThree, so it affects all of its ancestors, including OneTwoThree).
Scenario 3: Multiple inheritance
class Two extends OneTwo, TwoThree {
  Two() {
    this = 2
  }
  override string getAString() {
    result = "Two: " + this.toString()
  }
}
// Explanation 1: Two extends both TwoThree and OneTwo. If no characteristic predicate is written, Two contains exactly the values that satisfy both parent classes. If one is written, it can only narrow that intersection further.
// Explanation 2: If the parent classes in multiple inheritance provide several definitions of a member predicate with the same name, the subclass must override them to avoid ambiguity. The getAString() of Two therefore cannot be omitted here.
from OneTwoThree o
select o, o.getAString()
/* result:
o | getAString() result
1 | One or two: 1
2 | Two: 2
3 | Two or three: 3
Understanding: Two is a subclass of both OneTwo and TwoThree, so for the shared value 2 both parents' results are overridden rather than coexisting.
*/
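Explanation 1 can be checked in isolation with a minimal sketch. The classes are redefined here without any member predicates, so nothing has to be overridden, and the name Both is my own:

```ql
class OneTwoThree extends int {
  OneTwoThree() { this in [1 .. 3] }
}

class OneTwo extends OneTwoThree {
  OneTwo() { this in [1 .. 2] }
}

class TwoThree extends OneTwoThree {
  TwoThree() { this in [2 .. 3] }
}

// No characteristic predicate: the members are the values that are in both parents
class Both extends OneTwo, TwoThree { }

from Both b
select b
// Result: 2
```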
Going back to the Southerner example, create a predicate isBald that determines whether someone is bald
predicate isBald(Person p) {
  not exists(string c | p.getHairColor() = c) // Without the not, this would mean the person has some hair color
}
// Final result: the bald southerners who are allowed in the north
from Southerner s
where s.isAllowedIn("north") and isBald(s)
select s, s.getAge()
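The reason isBald works is that a predicate with a result may have no result at all for some arguments. Here is a minimal sketch of the same pattern on integers, where getColor is a hypothetical stand-in for getHairColor() and is not part of the tutorial library:

```ql
// Hypothetical stand-in for getHairColor(): 3 has no result, like a bald person
string getColor(int i) {
  i = 1 and result = "black"
  or
  i = 2 and result = "brown"
}

predicate hasNoColor(int i) {
  i in [1 .. 3] and
  not exists(string c | c = getColor(i))
}

from int i
where hasNoColor(i)
select i
// Result: 3
```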
