Kshitijk4poor/main #1304

Open
wants to merge 66 commits into
base: main
66 commits
ba4d38e
dns_resolver
kshitijk4poor Jul 12, 2024
1b370bb
Merge branch 'CERT-Polska:main' into main
kshitijk4poor Jul 12, 2024
a578fb4
linting
kshitijk4poor Jul 13, 2024
52098dc
lookup for NS and A records
kshitijk4poor Jul 17, 2024
4852348
fixed lint
kshitijk4poor Jul 17, 2024
bd5f25a
Merge branch 'CERT-Polska:main' into main
kshitijk4poor Jul 18, 2024
1c7aa6d
yeahhh
kshitijk4poor Jul 22, 2024
5876115
added task
kshitijk4poor Jul 22, 2024
cd7d4c0
Merge branch 'CERT-Polska:main' into main
kshitijk4poor Jul 27, 2024
9bac8ef
passing pre-commit
kshitijk4poor Jul 29, 2024
1101e33
updated logic
kshitijk4poor Jul 29, 2024
15d32cd
fixed whitespace-trail
kshitijk4poor Jul 29, 2024
d70c894
should fix lint
kshitijk4poor Jul 29, 2024
57d4a4e
added test
kshitijk4poor Jul 29, 2024
e255f09
implemented the module as a function
kshitijk4poor Aug 2, 2024
23bd878
cleanup
kshitijk4poor Aug 2, 2024
c9fd4d6
lint
kshitijk4poor Aug 2, 2024
7ce1ecd
fixed
kshitijk4poor Aug 2, 2024
78ee50c
using the payload
kshitijk4poor Aug 2, 2024
481a9d7
fixed param
kshitijk4poor Aug 2, 2024
e5dcdfe
fixed context
kshitijk4poor Aug 2, 2024
9c7b86b
update
kshitijk4poor Aug 2, 2024
480d432
fixed
kshitijk4poor Aug 3, 2024
a6cfe1a
lint pass
kshitijk4poor Aug 3, 2024
c887da8
finally working
kshitijk4poor Aug 3, 2024
e4678e2
fixed with better error handling
kshitijk4poor Aug 3, 2024
7b08d3c
better logging and documentation
kshitijk4poor Aug 3, 2024
01dbe78
.
kshitijk4poor Aug 3, 2024
0e969ef
fixed triling-whitespaces and end-of-file for docker-compose.yaml
kshitijk4poor Aug 3, 2024
bdf4c06
fixed test
kshitijk4poor Aug 5, 2024
2feb681
lint
kshitijk4poor Aug 5, 2024
dc01949
Merge branch 'CERT-Polska:main' into main
kshitijk4poor Aug 5, 2024
dc356d4
fixed return
kshitijk4poor Aug 9, 2024
594aecd
lint
kshitijk4poor Aug 9, 2024
da27aa4
fix
kshitijk4poor Aug 9, 2024
a0330a5
lint
kshitijk4poor Aug 9, 2024
1977c41
Merge branch 'CERT-Polska:main' into main
kshitijk4poor Aug 9, 2024
a81a1db
fixed
kshitijk4poor Aug 13, 2024
9003dc2
lint
kshitijk4poor Aug 13, 2024
26e65f3
lint
kshitijk4poor Aug 14, 2024
5fb5944
Merge branch 'CERT-Polska:main' into main
kshitijk4poor Aug 21, 2024
4fbec0a
fix
kazet Sep 12, 2024
13bb41f
.
kazet Sep 12, 2024
808fe37
Merge branch 'CERT-Polska:main' into main
kshitijk4poor Sep 19, 2024
60cda04
Merge
kazet Oct 2, 2024
6c7dd56
fix
kazet Oct 2, 2024
2e74ce0
Merge branch 'CERT-Polska:main' into main
kshitijk4poor Oct 3, 2024
32ed0f3
Placeholder scanning pages
michalkrzem Oct 4, 2024
b113b69
lint
michalkrzem Oct 4, 2024
7beb989
.
michalkrzem Oct 4, 2024
6e3cb3e
After revie - firs example by home.pl
michalkrzem Oct 7, 2024
9885e04
After the review - without bs4. Analyzing blank pages of hosting prov…
michalkrzem Oct 7, 2024
a6d3383
Except - RequestException in check_response()
michalkrzem Oct 7, 2024
9e60184
.
michalkrzem Oct 7, 2024
5960034
Pages with a domain but doesn't exist
michalkrzem Oct 9, 2024
bf1c907
lint
michalkrzem Oct 9, 2024
c80a73b
lint
michalkrzem Oct 9, 2024
1c9cb7f
After review - placeholder page content in file. Placeholder detector…
michalkrzem Oct 16, 2024
41c873d
lint
michalkrzem Oct 16, 2024
2f74692
lint
michalkrzem Oct 16, 2024
100b141
After review
michalkrzem Oct 21, 2024
5b8b8ee
config param name
michalkrzem Oct 21, 2024
731ef53
placeholder page content without def
michalkrzem Oct 21, 2024
16496fa
placeholder content file into env.test
michalkrzem Oct 21, 2024
8e7e98f
placeholder content file into test/data
michalkrzem Oct 21, 2024
bf55b4e
.
michalkrzem Oct 21, 2024
22 changes: 22 additions & 0 deletions artemis/config.py
@@ -698,6 +698,28 @@ class Nuclei:
                "NUCLEI_TEMPLATE_CHUNK_SIZE is 200, three calls will be made with 200 templates each.",
            ] = get_config("NUCLEI_TEMPLATE_CHUNK_SIZE", default=200, cast=int)

        class PlaceholderPageContent:
            ENABLE_PLACEHOLDER_PAGE_DETECTOR: Annotated[
                bool,
                "Enable or disable placeholder pages detector. Using this feature you may skip vulnerability scanning "
                "for websites that aren't built yet, but e.g. contain a hosting provider placeholder page. "
                "If the page exists and the specified string is found within it, the page will not be scanned for "
                "vulnerabilities. If the page is not marked as a placeholder, a full scan will be performed.",
            ] = get_config(
                "ENABLE_PLACEHOLDER_PAGE_DETECTOR",
                default=False,
                cast=bool,
            )
            PLACEHOLDER_PAGE_CONTENT_FILENAME: Annotated[
                str,
                "Path to placeholder page content file. The file is divided into lines – each line is a string "
                "containing a different HTML code element to check.",
            ] = get_config(
                "PLACEHOLDER_PAGE_CONTENT_FILENAME",
                default="/opt/artemis/modules/data/placeholder_page_content.txt",
                cast=str,
            )

        class PortScanner:
            PORT_SCANNER_PORT_LIST: Annotated[str, "Chosen list of ports to scan (can be 'short' or 'long')"] = (
                get_config("PORT_SCANNER_PORT_LIST", default="short")
6 changes: 6 additions & 0 deletions artemis/module_base.py
@@ -19,6 +19,7 @@
from artemis.config import Config
from artemis.db import DB
from artemis.domains import is_domain
from artemis.placeholder_page_detector import PlaceholderPageDetector
from artemis.redis_cache import RedisCache
from artemis.resolvers import NoAnswer, ResolutionException, lookup
from artemis.resource_lock import FailedToAcquireLockException, ResourceLock
@@ -153,6 +154,11 @@ def check_domain_exists(self, domain: str) -> bool:
            bool: True if the domain exists, False otherwise.
        """
        try:
            if Config.Modules.PlaceholderPageContent.ENABLE_PLACEHOLDER_PAGE_DETECTOR:
                placeholder_page = PlaceholderPageDetector()
                if placeholder_page.is_placeholder(domain):
                    return False

            # Check for NS records
            try:
                ns_records = lookup(domain, "NS")
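The gate added to `check_domain_exists` can be sketched in isolation. This is a simplified illustration, not the method from the PR: `domain_should_be_scanned`, `is_placeholder` and `has_dns_records` are hypothetical names, and the DNS part of the real method is collapsed into a single injected callable.

```python
from typing import Callable


def domain_should_be_scanned(
    domain: str,
    detector_enabled: bool,
    is_placeholder: Callable[[str], bool],
    has_dns_records: Callable[[str], bool],
) -> bool:
    """Hypothetical stand-in for the decision made in check_domain_exists."""
    # The placeholder check runs only when the feature flag is on, so the
    # extra HTTP request is avoided with the default configuration.
    if detector_enabled and is_placeholder(domain):
        return False
    # Otherwise fall through to the pre-existing existence check (NS/A lookups
    # in the real code, represented here by an injected callable).
    return has_dns_records(domain)
```

With the flag disabled, the placeholder callable is never invoked, which matches the `default=False` setting in `config.py`.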
12 changes: 12 additions & 0 deletions artemis/modules/data/placeholder_page_content.txt
@@ -0,0 +1,12 @@
<meta name="description" content="Strona pozostała na serwerze home.pl" />
<title>Domena newkf.nazwa.pl pozostaje na serwerze nazwa.pl</title>
<title>Tanie domeny, Tani hosting, Helpdesk, Pomoc zdalna - NetStrefa.pl</title>
<title qtlid="74178">Strona w budowie</title>
<title qtlid="74178">Miejsce w budowie</title>
<meta name="description" content="Numer 1 w polskim hostingu. Domeny, serwery, konta e-mail. Jakość potwierdzona certyfikatem ISO 9001:2000" />
<title>LOGONET Sp. z o.o. [C]</title>
<HTML><HEAD><TITLE>HostedWindows.pl</TITLE>
<BR>Witaj w serwisie <A class=link href="http://hostedwindows.pl/"><B>
<title>Cyber_Folks Lepsza obsługa i wsparcie bez porównania</title>
<meta name="description" content="Strona utrzymywana na serwerach home.pl" />
<title>Tanie domeny, Tani hosting, Helpdesk, Pomoc zdalna - NetStrefa.pl</title>
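The marker file above holds one HTML fragment per line, and the NetStrefa.pl title appears twice. A minimal sketch of a loader that strips whitespace, skips blank lines and drops repeated entries is shown below; `load_markers` is a hypothetical helper and is not part of the PR.

```python
from pathlib import Path
from typing import List


def load_markers(path: str) -> List[str]:
    """Read one marker per line, ignoring blank lines and repeated entries."""
    seen = set()
    markers: List[str] = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        marker = line.strip()
        # An empty marker would match every page, because "" is a substring
        # of any string, so blank lines are skipped.
        if marker and marker not in seen:
            seen.add(marker)
            markers.append(marker)
    return markers
```

Order is preserved, so the first occurrence of a repeated marker keeps its position.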
42 changes: 42 additions & 0 deletions artemis/placeholder_page_detector.py
@@ -0,0 +1,42 @@
from typing import Any

import requests

from artemis import http_requests
from artemis.config import Config

PLACEHOLDER_PAGE_CONTENT_FILENAME = Config.Modules.PlaceholderPageContent.PLACEHOLDER_PAGE_CONTENT_FILENAME


PLACEHOLDER_PAGE_CONTENT = []
with open(PLACEHOLDER_PAGE_CONTENT_FILENAME, "r", encoding="utf-8") as file:
    for keyword in file:
        PLACEHOLDER_PAGE_CONTENT.append(keyword)


class PlaceholderPageDetector:
    def __init__(self) -> None:
        self.placeholder_content = PLACEHOLDER_PAGE_CONTENT

    @staticmethod
    def check_response(domain: str) -> Any:
        url = "http://" + domain
        try:
            response = http_requests.get(url)
        except requests.RequestException:
            url = "https://" + domain
            try:
                response = http_requests.get(url)
            except requests.RequestException:
                return False
        # response.encoding = "utf-8"
        return response

    def is_placeholder(self, domain: str) -> bool:
        response = self.check_response(domain)
        if response:
            html_content = response.content
            for keywords in self.placeholder_content:
                if keywords.strip() in html_content:
                    return True
        return False
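The substring check in `is_placeholder` can be exercised without any network access. The sketch below is an illustration only: `matches_placeholder` is a hypothetical pure function, and the bytes handling is an assumption added so the example works whether the body arrives as `str` or `bytes`; the PR itself relies on whatever type `response.content` has in `artemis.http_requests`.

```python
from typing import Iterable, Union


def matches_placeholder(html: Union[str, bytes], markers: Iterable[str]) -> bool:
    """Return True if any non-empty marker occurs verbatim in the page body."""
    # Assumption: a bytes body is UTF-8; undecodable bytes are replaced
    # instead of raising, so a malformed page cannot break the check.
    if isinstance(html, bytes):
        html = html.decode("utf-8", errors="replace")
    for marker in markers:
        needle = marker.strip()
        # Empty needles are skipped, since "" is contained in every string.
        if needle and needle in html:
            return True
    return False
```

The match is exact and case-sensitive, as in the PR, so a marker such as `<title qtlid="74178">Strona w budowie</title>` does not match a page whose title tag lacks the `qtlid` attribute.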
1 change: 1 addition & 0 deletions env.test
@@ -1,3 +1,4 @@
DB_CONN_STR=
REDIS_CONN_STR=redis://test-redis:6379/1
POSTGRES_CONN_STR=postgresql://postgres:postgres@postgres-test/artemis
PLACEHOLDER_PAGE_CONTENT_FILENAME=/opt/test/data/test_placeholder_page_content.txt
12 changes: 12 additions & 0 deletions test/data/test_placeholder_page_content.txt
@@ -0,0 +1,12 @@
<meta name="description" content="Strona pozostała na serwerze home.pl" />
<title>Domena newkf.nazwa.pl pozostaje na serwerze nazwa.pl</title>
<title>Tanie domeny, Tani hosting, Helpdesk, Pomoc zdalna - NetStrefa.pl</title>
<title qtlid="74178">Strona w budowie</title>
<title qtlid="74178">Miejsce w budowie</title>
<meta name="description" content="Numer 1 w polskim hostingu. Domeny, serwery, konta e-mail. Jakość potwierdzona certyfikatem ISO 9001:2000" />
<title>LOGONET Sp. z o.o. [C]</title>
<HTML><HEAD><TITLE>HostedWindows.pl</TITLE>
<BR>Witaj w serwisie <A class=link href="http://hostedwindows.pl/"><B>
<title>Cyber_Folks Lepsza obsługa i wsparcie bez porównania</title>
<meta name="description" content="Strona utrzymywana na serwerach home.pl" />
<title>Tanie domeny, Tani hosting, Helpdesk, Pomoc zdalna - NetStrefa.pl</title>