[Math][Python] Cartesian product by example

This is the simplest explanation of cartesian product.

Seeding a web crawler

You write a simple web crawler. Where to start? You pick most popular TLDs (com, net, org) and a list of English words. Combine them to get list of domains. This is cartesian product of three sets.

#!/usr/bin/env python3

import itertools

TLDs=["com", "net", "org"]
words=["apple", "banana", "orange"]
http=["http", "https"]

for _http, _word, _TLD in itertools.product(http, words, TLDs):
    print (_http+"://"+_word+"."+_TLD)
http://apple.com
http://apple.net
http://apple.org
http://banana.com
http://banana.net
http://banana.org
http://orange.com
http://orange.net
http://orange.org
https://apple.com
https://apple.net
https://apple.org
https://banana.com
https://banana.net
https://banana.org
https://orange.com
https://orange.net
https://orange.org

Size of resulting set (domains) = TLDs * words * 2. This is why 'product'.

All this you can achieve with three nested for loops. And it may be more efficient! But calling itertools.product() is more readable and concise.

Testing

Compile your code for various OS, arch, using two compiler flags (optimizations).

#!/usr/bin/env python3

import itertools

OS=["Windows", "macOS", "Linux"]
arch=["x86_64", "arm"]
compiler_flags=["O1", "O3"]

for _OS, _arch, _compiler_flags in itertools.product(OS, arch, compiler_flags):
    print (_OS, _arch, _compiler_flags)
Windows x86_64 O1
Windows x86_64 O3
Windows arm O1
Windows arm O3
macOS x86_64 O1
macOS x86_64 O3
macOS arm O1
macOS arm O3
Linux x86_64 O1
Linux x86_64 O3
Linux arm O1
Linux arm O3

Passwords cracking

If you know that password can have only a/b/c/d characters. Password length is 3 characters.

>>> import itertools
>>> a=['a','b','c','d']
>>> list(itertools.product(a,a,a))
[('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'a', 'd'), ('a', 'b', 'a'), ('a', 'b', 'b'), ('a', 'b', 'c'), ('a', 'b', 'd'), ('a', 'c', 'a'), ('a', 'c', 'b'), ('a', 'c', 'c'), ('a', 'c', 'd'), ('a', 'd', 'a'), ('a', 'd', 'b'), ('a', 'd', 'c'), ('a', 'd', 'd'), ('b', 'a', 'a'), ('b', 'a', 'b'), ('b', 'a', 'c'), ('b', 'a', 'd'), ('b', 'b', 'a'), ('b', 'b', 'b'), ('b', 'b', 'c'), ('b', 'b', 'd'), ('b', 'c', 'a'), ('b', 'c', 'b'), ('b', 'c', 'c'), ('b', 'c', 'd'), ('b', 'd', 'a'), ('b', 'd', 'b'), ('b', 'd', 'c'), ('b', 'd', 'd'), ('c', 'a', 'a'), ('c', 'a', 'b'), ('c', 'a', 'c'), ('c', 'a', 'd'), ('c', 'b', 'a'), ('c', 'b', 'b'), ('c', 'b', 'c'), ('c', 'b', 'd'), ('c', 'c', 'a'), ('c', 'c', 'b'), ('c', 'c', 'c'), ('c', 'c', 'd'), ('c', 'd', 'a'), ('c', 'd', 'b'), ('c', 'd', 'c'), ('c', 'd', 'd'), ('d', 'a', 'a'), ('d', 'a', 'b'), ('d', 'a', 'c'), ('d', 'a', 'd'), ('d', 'b', 'a'), ('d', 'b', 'b'), ('d', 'b', 'c'), ('d', 'b', 'd'), ('d', 'c', 'a'), ('d', 'c', 'b'), ('d', 'c', 'c'), ('d', 'c', 'd'), ('d', 'd', 'a'), ('d', 'd', 'b'), ('d', 'd', 'c'), ('d', 'd', 'd')]

... or ...

>>> list(itertools.product(['a','b','c','d'], repeat=3))
[('a', 'a', 'a'), ('a', 'a', 'b'), ...

This code can be scaled.

What if a 3-characters password, where first character can be uppercase/lowercase, but rest is always lowercase?

>>> u=['A','a','b','B']
>>> l=['a','b']
>>> list(itertools.product(u+l,l,l))
[('A', 'a', 'a'), ('A', 'a', 'b'), ('A', 'b', 'a'), ('A', 'b', 'b'), ('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'b', 'a'), ('a', 'b', 'b'), ('b', 'a', 'a'), ('b', 'a', 'b'), ('b', 'b', 'a'), ('b', 'b', 'b'), ('B', 'a', 'a'), ('B', 'a', 'b'), ('B', 'b', 'a'), ('B', 'b', 'b'), ('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'b', 'a'), ('a', 'b', 'b'), ('b', 'a', 'a'), ('b', 'a', 'b'), ('b', 'b', 'a'), ('b', 'b', 'b')]

Other ideas about password cracking with itertools, you can find here.

Bash

Get list of year/months in shell:

% echo {2020..2022}-{01..12}

2020-01 2020-02 2020-03 2020-04 2020-05 2020-06 2020-07 2020-08 2020-09 2020-10
2020-11 2020-12 2021-01 2021-02 2021-03 2021-04 2021-05 2021-06 2021-07 2021-08
2021-09 2021-10 2021-11 2021-12 2022-01 2022-02 2022-03 2022-04 2022-05 2022-06
2022-07 2022-08 2022-09 2022-10 2022-11 2022-12

Add day. (Some months are incorrect!)

% echo {2020..2022}-{01..12}-{01..31}
2020-01-01 2020-01-02 2020-01-03 2020-01-04 2020-01-05 2020-01-06 2020-01-07 2020-01-08
2020-01-09 2020-01-10 2020-01-11 2020-01-12 2020-01-13 2020-01-14 2020-01-15 2020-01-16
2020-01-17 2020-01-18 2020-01-19 2020-01-20 2020-01-21 2020-01-22 2020-01-23 2020-01-24
2020-01-25 2020-01-26 2020-01-27 2020-01-28 2020-01-29 2020-01-30 2020-01-31 2020-02-01
2020-02-02 2020-02-03 2020-02-04 2020-02-05 2020-02-06 2020-02-07 2020-02-08 2020-02-09
...

SQL

Cartesian product (AKA cross product) is what you have in SQL as inner join.

Let's try MySQL.

INNER JOIN and , (comma) are semantically equivalent in the absence of a join condition: both produce a Cartesian product between the specified tables (that is, each and every row in the first table is joined to each and every row in the second table).

create table `YEAR`
(
`YEAR` INT NOT NULL UNIQUE
);

create table `MONTH`
(
`MONTH` INT NOT NULL UNIQUE
);

insert into YEAR (YEAR) values (2020);
insert into YEAR (YEAR) values (2021);
insert into YEAR (YEAR) values (2022);

insert into MONTH (MONTH) values (1);
insert into MONTH (MONTH) values (2);
insert into MONTH (MONTH) values (3);
insert into MONTH (MONTH) values (4);
insert into MONTH (MONTH) values (5);
insert into MONTH (MONTH) values (6);
insert into MONTH (MONTH) values (7);
insert into MONTH (MONTH) values (8);
insert into MONTH (MONTH) values (9);
insert into MONTH (MONTH) values (10);
insert into MONTH (MONTH) values (11);
insert into MONTH (MONTH) values (12);
mysql> select * from YEAR;
+------+
| YEAR |
+------+
| 2020 |
| 2021 |
| 2022 |
+------+
3 rows in set (0.00 sec)

mysql> select * from MONTH;
+-------+
| MONTH |
+-------+
|     1 |
|     2 |
|     3 |
...
|    10 |
|    11 |
|    12 |
+-------+
12 rows in set (0.00 sec)

mysql> SELECT * FROM YEAR INNER JOIN MONTH;
+------+-------+
| YEAR | MONTH |
+------+-------+
| 2022 |     1 |
| 2021 |     1 |
| 2020 |     1 |
| 2022 |     2 |
| 2021 |     2 |
...
| 2021 |    11 |
| 2020 |    11 |
| 2022 |    12 |
| 2021 |    12 |
| 2020 |    12 |
+------+-------+
36 rows in set (0.00 sec)

As seen at Reddit.

(the post first published at 20221015.)


List of my other blog posts.

Subscribe to my news feed

Yes, I know about these lousy Disqus ads. Please use adblocker. I would consider to subscribe to 'pro' version of Disqus if the signal/noise ratio in comments would be good enough.