## [Math][Python] Cartesian product by example

This is the simplest explanation of cartesian product.

### Seeding a web crawler

You write a simple web crawler. Where to start? You pick most popular TLDs (com, net, org) and a list of English words. Combine them to get list of domains. This is cartesian product of three sets.

#!/usr/bin/env python3

import itertools

TLDs=["com", "net", "org"]
words=["apple", "banana", "orange"]
http=["http", "https"]

for _http, _word, _TLD in itertools.product(http, words, TLDs):
print (_http+"://"+_word+"."+_TLD)

http://apple.com
http://apple.net
http://apple.org
http://banana.com
http://banana.net
http://banana.org
http://orange.com
http://orange.net
http://orange.org
https://apple.com
https://apple.net
https://apple.org
https://banana.com
https://banana.net
https://banana.org
https://orange.com
https://orange.net
https://orange.org


Size of resulting set (domains) = TLDs * words * 2. This is why 'product'.

All this you can achieve with three nested for loops. And it may be more efficient! But calling itertools.product() is more readable and concise.

### Testing

Compile your code for various OS, arch, using two compiler flags (optimizations).

#!/usr/bin/env python3

import itertools

OS=["Windows", "macOS", "Linux"]
arch=["x86_64", "arm"]
compiler_flags=["O1", "O3"]

for _OS, _arch, _compiler_flags in itertools.product(OS, arch, compiler_flags):
print (_OS, _arch, _compiler_flags)

Windows x86_64 O1
Windows x86_64 O3
Windows arm O1
Windows arm O3
macOS x86_64 O1
macOS x86_64 O3
macOS arm O1
macOS arm O3
Linux x86_64 O1
Linux x86_64 O3
Linux arm O1
Linux arm O3


If you know that password can have only a/b/c/d characters. Password length is 3 characters.

>>> import itertools
>>> a=['a','b','c','d']
>>> list(itertools.product(a,a,a))
[('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'a', 'd'), ('a', 'b', 'a'), ('a', 'b', 'b'), ('a', 'b', 'c'), ('a', 'b', 'd'), ('a', 'c', 'a'), ('a', 'c', 'b'), ('a', 'c', 'c'), ('a', 'c', 'd'), ('a', 'd', 'a'), ('a', 'd', 'b'), ('a', 'd', 'c'), ('a', 'd', 'd'), ('b', 'a', 'a'), ('b', 'a', 'b'), ('b', 'a', 'c'), ('b', 'a', 'd'), ('b', 'b', 'a'), ('b', 'b', 'b'), ('b', 'b', 'c'), ('b', 'b', 'd'), ('b', 'c', 'a'), ('b', 'c', 'b'), ('b', 'c', 'c'), ('b', 'c', 'd'), ('b', 'd', 'a'), ('b', 'd', 'b'), ('b', 'd', 'c'), ('b', 'd', 'd'), ('c', 'a', 'a'), ('c', 'a', 'b'), ('c', 'a', 'c'), ('c', 'a', 'd'), ('c', 'b', 'a'), ('c', 'b', 'b'), ('c', 'b', 'c'), ('c', 'b', 'd'), ('c', 'c', 'a'), ('c', 'c', 'b'), ('c', 'c', 'c'), ('c', 'c', 'd'), ('c', 'd', 'a'), ('c', 'd', 'b'), ('c', 'd', 'c'), ('c', 'd', 'd'), ('d', 'a', 'a'), ('d', 'a', 'b'), ('d', 'a', 'c'), ('d', 'a', 'd'), ('d', 'b', 'a'), ('d', 'b', 'b'), ('d', 'b', 'c'), ('d', 'b', 'd'), ('d', 'c', 'a'), ('d', 'c', 'b'), ('d', 'c', 'c'), ('d', 'c', 'd'), ('d', 'd', 'a'), ('d', 'd', 'b'), ('d', 'd', 'c'), ('d', 'd', 'd')]


... or ...

>>> list(itertools.product(['a','b','c','d'], repeat=3))
[('a', 'a', 'a'), ('a', 'a', 'b'), ...


This code can be scaled.

What if a 3-characters password, where first character can be uppercase/lowercase, but rest is always lowercase?

>>> u=['A','a','b','B']
>>> l=['a','b']
>>> list(itertools.product(u+l,l,l))
[('A', 'a', 'a'), ('A', 'a', 'b'), ('A', 'b', 'a'), ('A', 'b', 'b'), ('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'b', 'a'), ('a', 'b', 'b'), ('b', 'a', 'a'), ('b', 'a', 'b'), ('b', 'b', 'a'), ('b', 'b', 'b'), ('B', 'a', 'a'), ('B', 'a', 'b'), ('B', 'b', 'a'), ('B', 'b', 'b'), ('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'b', 'a'), ('a', 'b', 'b'), ('b', 'a', 'a'), ('b', 'a', 'b'), ('b', 'b', 'a'), ('b', 'b', 'b')]


### Bash

Get list of year/months in shell:

% echo {2020..2022}-{01..12}

2020-01 2020-02 2020-03 2020-04 2020-05 2020-06 2020-07 2020-08 2020-09 2020-10
2020-11 2020-12 2021-01 2021-02 2021-03 2021-04 2021-05 2021-06 2021-07 2021-08
2021-09 2021-10 2021-11 2021-12 2022-01 2022-02 2022-03 2022-04 2022-05 2022-06
2022-07 2022-08 2022-09 2022-10 2022-11 2022-12


Add day. (Some months are incorrect!)

% echo {2020..2022}-{01..12}-{01..31}
2020-01-01 2020-01-02 2020-01-03 2020-01-04 2020-01-05 2020-01-06 2020-01-07 2020-01-08
2020-01-09 2020-01-10 2020-01-11 2020-01-12 2020-01-13 2020-01-14 2020-01-15 2020-01-16
2020-01-17 2020-01-18 2020-01-19 2020-01-20 2020-01-21 2020-01-22 2020-01-23 2020-01-24
2020-01-25 2020-01-26 2020-01-27 2020-01-28 2020-01-29 2020-01-30 2020-01-31 2020-02-01
2020-02-02 2020-02-03 2020-02-04 2020-02-05 2020-02-06 2020-02-07 2020-02-08 2020-02-09
...


### SQL

Cartesian product (AKA cross product) is what you have in SQL as inner join.

Let's try MySQL.

create table YEAR
(
YEAR INT NOT NULL UNIQUE
);

create table MONTH
(
MONTH INT NOT NULL UNIQUE
);

insert into YEAR (YEAR) values (2020);
insert into YEAR (YEAR) values (2021);
insert into YEAR (YEAR) values (2022);

insert into MONTH (MONTH) values (1);
insert into MONTH (MONTH) values (2);
insert into MONTH (MONTH) values (3);
insert into MONTH (MONTH) values (4);
insert into MONTH (MONTH) values (5);
insert into MONTH (MONTH) values (6);
insert into MONTH (MONTH) values (7);
insert into MONTH (MONTH) values (8);
insert into MONTH (MONTH) values (9);
insert into MONTH (MONTH) values (10);
insert into MONTH (MONTH) values (11);
insert into MONTH (MONTH) values (12);

mysql> select * from YEAR;
+------+
| YEAR |
+------+
| 2020 |
| 2021 |
| 2022 |
+------+
3 rows in set (0.00 sec)

mysql> select * from MONTH;
+-------+
| MONTH |
+-------+
|     1 |
|     2 |
|     3 |
...
|    10 |
|    11 |
|    12 |
+-------+
12 rows in set (0.00 sec)

mysql> SELECT * FROM YEAR INNER JOIN MONTH;
+------+-------+
| YEAR | MONTH |
+------+-------+
| 2022 |     1 |
| 2021 |     1 |
| 2020 |     1 |
| 2022 |     2 |
| 2021 |     2 |
...
| 2021 |    11 |
| 2020 |    11 |
| 2022 |    12 |
| 2021 |    12 |
| 2020 |    12 |
+------+-------+
36 rows in set (0.00 sec)


###### (the post first published at 20221015.)

Yes, I know about these lousy Disqus ads. Please use adblocker. I would consider to subscribe to 'pro' version of Disqus if the signal/noise ratio in comments would be good enough.